(转载)硬解析和软解析情况下latch持有情况分析

http://www.dbsnake.net/hard-and-soft-parse-latch-analysis.html#comment-359

硬解析和软解析情况下latch持有情况分析

 

为了方便阐述,这篇文章里我用了如下一些术语:

1、用“library cache lock”代表名为library cache lock的enqueue;

2、用“library cache pin”代表名为library cache pin的enqueue;

3、用“Child library cache”代表名为library cache的子latch;

4、用“Child shared pool”代表名为shared pool的子latch;

5、用“Child library cache lock”代表名为library cache lock的子latch;

6、用“Child library cache pin”代表名为library cache pin的子latch;

 

你可能需要先阅读如下的三篇文章才能看懂我在说什么:

shared pool latch和library cache latch

library cache lock和library cache pin到底是什么

library cache lock和library cache pin到底是什么(续)

 

另外需要注意的是:从11gR1开始,Oracle用mutex替换了各种跟library cache相关的latch,所以这篇文章的结论并不适用于11g。

 

我们首先来看library cache的经典架构:

 library_cache_architecture

 

其中我们可以看到在每一个library cache object handle上都存在4个链表,它们分别是lock owners,lock waiters,Pin owners,Pin waiters。所以实际上library cache object handle你可以把它看成是library cache locklibrary cache pin所保护的resource。而向lock ownerslock waiters里添加/修改一个library cache lock时是需要先持有Child library cache lock;向Pin ownersPin waiters里添加/修改一个library cache pin时是需要先持有Child library cache pin

无论是硬解析还是软解析,在扫描/修改library cache object handle链表的时候是需要先持有Child library cache;另外,硬解析还涉及到要在shared pool里分配内存,所以硬解析还会持有Child shared pool

 

好了,我们来分6步进一步细化和扩展上述结论,这6步考察latch持有情况的思路均为:同时开3个session,session 1中先持有所有的相关子latch;session 2在session 1持有完所有的相关子latch后去执行硬解析/软解析,这个时候session 2一定会hang住;同时我们在session 3中去做session 2的process dump,从session 3的dump文件中即可看到session 2中的latch持有与等待的情况。

 

好了,我们开始。

第一步:先来证明硬解析会在持有Child library cache且并不释放的情况下转而去持有Child shared pool

先分别到session 1、session 2、session 3里做一下准备工作:

Session 1:

SQL> select name,value from all_parameters where name=’_kghdsidx_count’;

 

NAME                                                                             VALUE

——————————————–     —————————————————————-

_kghdsidx_count                  1

 

SQL> select name,level#,child#,addr,gets from v$latch_children where name like ‘library cache%’ or name=’shared pool’;

 

NAME                                   LEVEL#     CHILD# ADDR           GETS

————————————————– ———- ———- ——– ———-

shared pool                                      7          1 04E5D308     783011

shared pool                                      7          2 04E5D370         26

shared pool                                      7          3 04E5D3D8         26

shared pool                                      7          4 04E5D440         26

shared pool                                      7          5 04E5D4A8         26

shared pool                                      7          6 04E5D510         26

shared pool                                      7          7 04E5D578         26

library cache                                     5          1 3213DE80     406403

library cache                                     5          2 3213DE18     322236

library cache                                     5          3 3213DDB0      96778

library cache lock                                 6          1 3213E0F0     107337

library cache lock                                 6          2 3213E088      86643

library cache lock                                 6          3 3213E020      28278

library cache pin                                  6          1 3213DFB8     204881

library cache pin                                  6          2 3213DF50     170659

library cache pin                                  6          3 3213DEE8      51722

library cache pin allocation                         3          1 3213E228        561

library cache pin allocation                         3          2 3213E1C0         38

library cache pin allocation                         3          3 3213E158        206

library cache lock allocation                        3          1 3213E360        722

library cache lock allocation                        3          2 3213E2F8         46

library cache lock allocation                        3          3 3213E290        229

library cache hash chains                          9          1 3213E498          0

library cache hash chains                          9          2 3213E430          0

library cache hash chains                          9          3 3213E3C8          0

 

25 rows selected

从上述结果可以看到,Child library cache的level为5,Child library cache lock和Child library cache pin的level为6,Child shared pool的level为7,所以已经具备了在不释放Child library cache的情况下进而去持有Child shared pool的条件。

 

SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’shared pool’ and child#=1;

 

‘ORADEBUGPOKE0X’||ADDR||’40X00

————————————-

oradebug poke 0x04E5D308 4 0×00000001

 

因为要测硬解析,这里我flush一下shared pool:

SQL> alter system flush shared_pool;

 

系统已更改。

 

Session 2:

SQL> select sid from v$mystat where rownum=1;

 

       SID

———-

       137

 

SQL> select spid from v$process where addr in (select paddr from v$session where sid=138);

 

SPID

————

3952

 

Session 3:

SQL> oradebug setospid 3952

Oracle pid: 24, Windows thread id: 3952, image: ORACLE.EXE (SHAD)

 

准备工作已经做完,我们现在去session 1里手工持有上述Child shared pool:

Session 1:

SQL> oradebug setmypid

已处理的语句

SQL> oradebug poke 0x04E5D308 4 0×00000001

BEFORE: [4E5D308, 4E5D30C) = 00000000

AFTER:  [4E5D308, 4E5D30C) = 00000001

 

然后到session 2中去随便执行一条sql,因为刚才在session 1里面flush了shared pool,所以这里一定是硬解析,并且这里一定会hang住:

Session 2

SQL> select * from scott.emp;

……这里hang住了

 

接着我们到session 3中去对session 2做一下process dump:

Session 3:

SQL> oradebug dump processstate 10

已处理的语句

 

SQL> oradebug tracefile_name

d:\oracle\admin\cuihua\udump\cuihua_ora_3952.trc

 

d:\oracle\admin\cuihua\udump\cuihua_ora_3952.trc的内容:

Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_3952.trc

Sat May 05 01:13:57 2012

ORACLE V10.2.0.1.0 - Production vsnsta=0

vsnsql=14 vsnxtr=3

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 - type 586

Process Affinity    : 0x00000000

Memory (Avail/Total): Ph:1005M/2030M, Ph+PgF:2333M/3921M, VA:1273M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 1

 

Oracle process number: 24

 

Windows thread id: 3952, image: ORACLE.EXE (SHAD)

 

 

*** 2012-05-05 01:13:57.468

*** ACTION NAME:(Command Window - New) 2012-05-05 01:13:57.406

*** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:13:57.406

*** SERVICE NAME:(cuihua) 2012-05-05 01:13:57.406

*** SESSION ID:(137.75) 2012-05-05 01:13:57.406

Received ORADEBUG command 'dump processstate 10' from process Windows thread id: 2924, image:

===================================================

PROCESS STATE

-------------

Process global information:

     process: 33E4FB7C, call: 33F606A8, xact: 00000000, curses: 33F254F4, usrses: 33F254F4

  ----------------------------------------

  SO: 33E4FB7C, type: 2, owner: 00000000, flag: INIT/-/-/0x00

  (process) Oracle pid=24, calls cur/top: 33F606A8/33F606A8, flag: (0) -

            int error: 0, call error: 0, sess error: 0, txn error 0

  (post info) last post received: 0 0 0

              last post received-location: No post

              last process to post me: none

              last post sent: 0 0 0

              last post sent-location: No post

              last process posted by me: none

    (latch info) wait_event=0 bits=20

        Location from where call was made: kghupr1: Chunk Header

        Context saved from call: 815704132

      waiting for 4e5d308 Child shared pool level=7 child#=1

        Location from where latch is held: kghupr1: Chunk Header

        Context saved from call: 737738784

        state=busy, wlstate=free

          waiters [orapid (seconds since: put on list, posted, alive check)]:

           10 (33, 1336151634, 33)

           20 (33, 1336151634, 33)

           24 (15, 1336151634, 15)

           waiter count=3

          gotten 857154 times wait, failed first 316 sleeps 24

          gotten 0 times nowait, failed: 0

      on wait list for 4e5d308

      holding    (efd=5) 3213de18 Child library cache level=5 child#=2

        Location from where latch is held: kglpndl: child: before processing: latch

        Context saved from call: 1

        state=busy, wlstate=free

    Process Group: DEFAULT, pseudo proc: 33E7F368

    O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 3952

    OSD pid info: Windows thread id: 3952, image: ORACLE.EXE (SHAD)

……省略显示其他内容

 

注意上述trace文件中有如下内容:

waiting for 4e5d308 Child shared pool level=7 child#=1

holding    (efd=5) 3213de18 Child library cache level=5 child#=2

所以我们已经证明了硬解析会在持有Child library cache且并不释放的情况下转而去持有Child shared pool。

 

第二步:我们来证明硬解析会去持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cacheChild library cache lock

先在session 1里手工持有所有的Child library cache lock:

Session 1:

SQL> alter system flush shared_pool;

 

系统已更改。

 

SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’library cache lock’;

 

‘ORADEBUGPOKE0X’||ADDR||’40X00

————————————-

oradebug poke 0x3213E0F0 4 0×00000001

oradebug poke 0x3213E088 4 0×00000001

oradebug poke 0x3213E020 4 0×00000001

 

SQL> oradebug poke 0x3213E0F0 4 0×00000001

BEFORE: [3213E0F0, 3213E0F4) = 00000000

AFTER:  [3213E0F0, 3213E0F4) = 00000001

SQL> oradebug poke 0x3213E088 4 0x00000001

BEFORE: [3213E088, 3213E08C) = 00000000

AFTER:  [3213E088, 3213E08C) = 00000001

SQL> oradebug poke 0x3213E020 4 0x00000001

BEFORE: [3213E020, 3213E024) = 00000000

AFTER:  [3213E020, 3213E024) = 00000001

 

然后到session 2中去随便执行一条sql,因为刚才在session 1里面flush了shared pool,所以这里一定是硬解析,并且这里一定会hang住:

Session 2:

SQL> select * from scott.emp;

……这里hang住了

 

接着我们到session 3中去对session 2做一下process dump:

Session 3:

SQL> oradebug dump processstate 10

已处理的语句

 

SQL> oradebug tracefile_name

d:\oracle\admin\cuihua\udump\cuihua_ora_3140.trc

 

d:\oracle\admin\cuihua\udump\cuihua_ora_3140.trc的内容:

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 - type 586

Process Affinity    : 0x00000000

Memory (Avail/Total): Ph:1142M/2030M, Ph+PgF:2355M/3921M, VA:1345M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 0 <none>

 

Oracle process number: 0

 

Windows thread id: 3140, image: ORACLE.EXE (SHAD)

 

 

KCCDEBUG_LEVEL = 0

Dynamic strand is set to TRUE

Running with 2 shared and 18 private strand(s). Zero-copy redo is FALSE

Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_3140.trc

Sat May 05 01:26:29 2012

ORACLE V10.2.0.1.0 - Production vsnsta=0

vsnsql=14 vsnxtr=3

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 - type 586

Process Affinity    : 0x00000000

Memory (Avail/Total): Ph:969M/2030M, Ph+PgF:2332M/3921M, VA:1267M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 1

 

Oracle process number: 31

 

Windows thread id: 3140, image: ORACLE.EXE (SHAD)

 

 

*** 2012-05-05 01:26:29.609

*** ACTION NAME:(Command Window - New) 2012-05-05 01:26:29.562

*** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:26:29.562

*** SERVICE NAME:(cuihua) 2012-05-05 01:26:29.562

*** SESSION ID:(153.27) 2012-05-05 01:26:29.562

Received ORADEBUG command 'dump processstate 10' from process Windows thread id: 2844, image:

===================================================

PROCESS STATE

-------------

Process global information:

     process: 33E5250C, call: 31A8ACBC, xact: 00000000, curses: 33F38174, usrses: 33F38174

  ----------------------------------------

  SO: 33E5250C, type: 2, owner: 00000000, flag: INIT/-/-/0x00

  (process) Oracle pid=31, calls cur/top: 31A8ACBC/31A8ACBC, flag: (0) -

            int error: 0, call error: 0, sess error: 0, txn error 0

  (post info) last post received: 0 0 0

              last post received-location: No post

              last process to post me: none

              last post sent: 0 0 0

              last post sent-location: No post

              last process posted by me: none

    (latch info) wait_event=0 bits=0

        Location from where call was made: kgllkdl: child: cleanup: latch

        Context saved from call: 1

      waiting for 3213e088 Child library cache lock level=6 child#=2

        Location from where latch is held: kgllkdl: child: cleanup: latch

        Context saved from call: 1

        state=busy, wlstate=free

          waiters [orapid (seconds since: put on list, posted, alive check)]:

           10 (15, 1336152387, 15)

           31 (3, 1336152387, 3)

           waiter count=2

          gotten 135516 times wait, failed first 6 sleeps 5

          gotten 0 times nowait, failed: 0

      on wait list for 3213e088

    Process Group: DEFAULT, pseudo proc: 33E7F368

    O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 3140

    OSD pid info: Windows thread id: 3140, image: ORACLE.EXE (SHAD)

……省略显示其他内容

 

注意上述trace文件中有如下内容:

waiting for 3213e088 Child library cache lock level=6 child#=2

但注意这里没有holding Child library cache所以我们已经证明了硬解析会持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cache和Child library cache lock。

 

第三步:我们来证明硬解析会去持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cacheChild library cache pin

证明步骤和思路与第二步一模一样,这里不再赘述,我们直接来看相关的process dump的trace文件:

d:\oracle\admin\cuihua\udump\cuihua_ora_2444.trc的内容:

Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2444.trc

Sat May 05 01:34:27 2012

ORACLE V10.2.0.1.0 – Production vsnsta=0

vsnsql=14 vsnxtr=3

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 – type 586

Process Affinity    : 0×00000000

Memory (Avail/Total): Ph:963M/2030M, Ph+PgF:2334M/3921M, VA:1269M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 1

 

Oracle process number: 30

 

Windows thread id: 2444, image: ORACLE.EXE (SHAD)

 

 

*** 2012-05-05 01:34:27.609

*** ACTION NAME:(Command Window – New) 2012-05-05 01:34:27.546

*** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:34:27.546

*** SERVICE NAME:(cuihua) 2012-05-05 01:34:27.546

*** SESSION ID:(130.15) 2012-05-05 01:34:27.546

Received ORADEBUG command ‘dump processstate 10′ from process Windows thread id: 800, image:

===================================================

PROCESS STATE

————-

Process global information:

     process: 33E51F1C, call: 317DD38C, xact: 00000000, curses: 33F1D17C, usrses: 33F1D17C

  —————————————-

  SO: 33E51F1C, type: 2, owner: 00000000, flag: INIT/-/-/0×00

  (process) Oracle pid=30, calls cur/top: 317DD38C/317DD38C, flag: (0) -

            int error: 0, call error: 0, sess error: 0, txn error 0

  (post info) last post received: 0 0 0

              last post received-location: No post

              last process to post me: none

              last post sent: 0 0 0

              last post sent-location: No post

              last process posted by me: none

    (latch info) wait_event=0 bits=0

        Location from where call was made: kglpnal: child: alloc space: latch

        Context saved from call: 1

      waiting for 3213df50 Child library cache pin level=6 child#=2

        Location from where latch is held: kglpndl: child

        Context saved from call: 1

        state=busy, wlstate=free

          waiters [orapid (seconds since: put on list, posted, alive check)]:

           10 (49, 1336152864, 49)

           30 (24, 1336152864, 24)

           waiter count=2

          gotten 348654 times wait, failed first 2 sleeps 1

          gotten 0 times nowait, failed: 0

      on wait list for 3213df50

    Process Group: DEFAULT, pseudo proc: 33E7F368

    O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 2444

    OSD pid info: Windows thread id: 2444, image: ORACLE.EXE (SHAD)

……省略显示其他内容

 

注意上述trace文件中有如下内容:

waiting for 3213df50 Child library cache pin level=6 child#=2

但注意这里没有holding Child library cache所以我们已经证明了硬解析会持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cache和Child library cache pin。

 

第四步:我们来证明软解析也会去持有Child library cache,所以过度的软解析实际上也会导致Child library cache的争用:

先到session 1、session 2和session 3中去做一下准备工作:

Session 1:

SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’library cache’;

 

‘ORADEBUGPOKE0X’||ADDR||’40X00

————————————-

oradebug poke 0x3213DE80 4 0×00000001

oradebug poke 0x3213DE18 4 0×00000001

oradebug poke 0x3213DDB0 4 0×00000001

 

Session 2:

SQL> show parameter session_cache;

 

NAME                                 TYPE        VALUE

———————————— ———– ——————————

session_cached_cursors               integer     20

这里session_cached_cursors的值大于0,所以这里执行同样的sql三次以上的时候才会是软软解析,如果同样的sql的执行次数大于一次但不超过三次,则是软解析。

 

SQL> select * from scott.emp;

 

EMPNO ENAME      JOB         MGR HIREDATE          SAL      COMM DEPTNO

—– ———- ——— —– ———– ——— ——— ——

 7981 CUIHUA                7981                                

 7369 SMITH      CLERK      7902 1980-12-17     800.00               20

 7499 ALLEN      SALESMAN   7698 1981-2-20     1600.00    300.00     30

 7521 WARD       SALESMAN   7698 1981-2-22     1250.00    500.00     30

 7566 JONES      MANAGER    7839 1981-4-2      2975.00               20

 7654 MARTIN     SALESMAN   7698 1981-9-28     1250.00   1400.00     30

 7698 BLAKE      MANAGER    7839 1981-5-1      2850.00               30

 7782 CLARK      MANAGER    7839 1981-6-9      2450.00               10

 7788 SCOTT      ANALYST    7566 1987-4-19     3000.00               20

 7839 KING       PRESIDENT       1981-11-17    5000.00               10

 7844 TURNER     SALESMAN   7698 1981-9-8      1500.00      0.00     30

 7876 ADAMS      CLERK      7788 1987-5-23     1100.00               20

 7800 JAME4      CLERK      7698 1981-12-3      950.00               30

 

13 rows selected

 

注意这里子cursor和执行次数均为1:

SQL> select sql_text, SQL_ID, VERSION_COUNT,executions from v$sqlarea where sql_text like ‘select * from scott.emp%’;

 

SQL_TEXT                                                                    SQL_ID        VERSION_COUNT EXECUTIONS

——————————————————————————– ————- ————- ———-

select * from scott.emp                                                          dzk32w36udhc5             1          1

 

SQL> select PLAN_HASH_VALUE, CHILD_NUMBER from v$sql where sql_id=’dzk32w36udhc5′;

 

PLAN_HASH_VALUE CHILD_NUMBER

————— ————

     3956160932            0

 

SQL> select sid from v$mystat where rownum=1;

 

       SID

———-

       134

 

SQL> select spid from v$process where addr in (select paddr from v$session where sid=134);

 

SPID

————

2740

 

Session 3:

SQL> oradebug setospid 2740

Oracle pid: 30, Windows thread id: 2740, image: ORACLE.EXE (SHAD)

 

接着我们回到session 1,去手工持有所有的Child library cache:

Session 1:

SQL> oradebug setmypid

已处理的语句

 

SQL> oradebug poke 0x3213DE80 4 0×00000001

BEFORE: [3213DE80, 3213DE84) = 00000000

AFTER:  [3213DE80, 3213DE84) = 00000001

SQL> oradebug poke 0x3213DE18 4 0x00000001

BEFORE: [3213DE18, 3213DE1C) = 00000000

AFTER:  [3213DE18, 3213DE1C) = 00000001

SQL> oradebug poke 0x3213DDB0 4 0x00000001

BEFORE: [3213DDB0, 3213DDB4) = 00000000

AFTER:  [3213DDB0, 3213DDB4) = 00000001

 

接着到session 2去执行一下我们刚才已经在session 1中执行过的sql,这里也会hang住:

Session 2:

SQL> select * from scott.emp;

……这里hang住了

 

接着我们到session 3中去对session 2做一下process dump:

Session 3:

SQL> oradebug dump processstate 10

已处理的语句

SQL> oradebug tracefile_name

d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc

 

d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc的内容:

Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc

Thu Jan 28 09:08:16 2010

ORACLE V10.2.0.1.0 - Production vsnsta=0

vsnsql=14 vsnxtr=3

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 - type 586

Process Affinity    : 0x00000000

Memory (Avail/Total): Ph:1387M/2030M, Ph+PgF:2882M/3921M, VA:1272M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 1

 

Oracle process number: 24

 

Windows thread id: 2740, image: ORACLE.EXE (SHAD)

 

 

*** SERVICE NAME:(cuihua) 2010-01-28 09:08:16.375

*** SESSION ID:(140.4) 2010-01-28 09:08:16.375

Memory Notification: Library Cache Object loaded into SGA

Heap size 2210K exceeds notification threshold (2048K)

LIBRARY OBJECT HANDLE: handle=33872c70 mutex=33872D24(0)

name=XDB.XDbD/PLZ01TcHgNAgAIIegtw==

hash=e0f82b0c545a707da230a62675d34c80 timestamp=08-30-2005 14:28:09

namespace=XDBS flags=KGHP/TIM/SML/[02000000]

kkkk-dddd-llll=0000-0000-0000 lock=S pin=S latch#=2 hpc=0002 hlc=0002

lwt=33872CCC[33872CCC,33872CCC] ltm=33872CD4[33872CD4,33872CD4]

pwt=33872CB0[33872CB0,33872CB0] ptm=33872CB8[33872CB8,33872CB8]

ref=33872CEC[33872CEC,33872CEC] lnd=33872CF8[33872CF8,33872CF8]

  LIBRARY OBJECT: object=303c0ba0

  type=XDBS flags=EXS/LOC[0005] pflags=[0000] status=VALD load=X

  DATA BLOCKS:

  data#     heap  pointer    status pins change whr alloc(K)  size(K)

  —– ——– ——– ——— —- —— — ——– ——–

      0 33872c00 303c0c5c I/P/A/-/-    0 NONE   00      0.34     0.00

      1 303c0c98 3038daa8 I/P/A/-/-    1 NONE   00   2210.98  2224.10

Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2740.trc

Sat May 05 08:48:42 2012

ORACLE V10.2.0.1.0 – Production vsnsta=0

vsnsql=14 vsnxtr=3

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 – type 586

Process Affinity    : 0×00000000

Memory (Avail/Total): Ph:1290M/2030M, Ph+PgF:2663M/3921M, VA:1268M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 1

 

Oracle process number: 30

 

Windows thread id: 2740, image: ORACLE.EXE (SHAD)

 

 

*** 2012-05-05 08:48:42.781

*** ACTION NAME:(Command Window – New) 2012-05-05 08:48:42.687

*** MODULE NAME:(PL/SQL Developer) 2012-05-05 08:48:42.687

*** SERVICE NAME:(cuihua) 2012-05-05 08:48:42.687

*** SESSION ID:(134.7) 2012-05-05 08:48:42.687

Received ORADEBUG command ‘dump processstate 10′ from process Windows thread id: 400, image:

===================================================

PROCESS STATE

————-

Process global information:

     process: 33E51F1C, call: 317C80A0, xact: 00000000, curses: 33F21C9C, usrses: 33F21C9C

  —————————————-

  SO: 33E51F1C, type: 2, owner: 00000000, flag: INIT/-/-/0×00

  (process) Oracle pid=30, calls cur/top: 317C80A0/317C80A0, flag: (0) -

            int error: 0, call error: 0, sess error: 0, txn error 0

  (post info) last post received: 0 0 0

              last post received-location: No post

              last process to post me: none

              last post sent: 0 0 0

              last post sent-location: No post

              last process posted by me: none

    (latch info) wait_event=0 bits=0

        Location from where call was made: kglpin: child: heap processing: latch

        Context saved from call: 1

      waiting for 3213de18 Child library cache level=5 child#=2

        Location from where latch is held: kglpndl: child: after processing: latch

        Context saved from call: 1

        state=busy, wlstate=free

          waiters [orapid (seconds since: put on list, posted, alive check)]:

           30 (39, 1336178921, 39)

           waiter count=1

          gotten 179887 times wait, failed first 15 sleeps 1

          gotten 206 times nowait, failed: 0

      on wait list for 3213de18

    Process Group: DEFAULT, pseudo proc: 33E7F368

    O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 2740

OSD pid info: Windows thread id: 2740, image: ORACLE.EXE (SHAD)

……省略显示其他内容

注意上述trace文件中有如下内容:

waiting for 3213de18 Child library cache level=5 child#=2

所以我们已经证明了软解析也会去持有Child library cache(这个其实很正常,因为软解析还是要扫描library cache object handle链表的嘛),即过度的软解析实际上也会导致Child library cache的争用。

 

手工释放上述Child library cache后回到session 2,刚才hang住的sql已恢复正常:

SQL> select * from scott.emp;

 

EMPNO ENAME      JOB         MGR HIREDATE          SAL      COMM DEPTNO

—– ———- ——— —– ———– ——— ——— ——

 7981 CUIHUA                7981                                

 7369 SMITH      CLERK      7902 1980-12-17     800.00               20

 7499 ALLEN      SALESMAN   7698 1981-2-20     1600.00    300.00     30

 7521 WARD       SALESMAN   7698 1981-2-22     1250.00    500.00     30

 7566 JONES      MANAGER    7839 1981-4-2      2975.00               20

 7654 MARTIN     SALESMAN   7698 1981-9-28     1250.00   1400.00     30

 7698 BLAKE      MANAGER    7839 1981-5-1      2850.00               30

 7782 CLARK      MANAGER    7839 1981-6-9      2450.00               10

 7788 SCOTT      ANALYST    7566 1987-4-19     3000.00               20

 7839 KING       PRESIDENT       1981-11-17    5000.00               10

 7844 TURNER     SALESMAN   7698 1981-9-8      1500.00      0.00     30

 7876 ADAMS      CLERK      7788 1987-5-23     1100.00               20

 7800 JAME4      CLERK      7698 1981-12-3      950.00               30

 

13 rows selected

 

从如下结果可以看到,子cursor的数量还是1,执行次数变成了2,即刚才第二次执行的sql毫无疑问是软解析:

SQL> select sql_text, SQL_ID, VERSION_COUNT,executions from v$sqlarea where sql_text like ‘select * from scott.emp%’;

 

SQL_TEXT                                                                         SQL_ID        VERSION_COUNT EXECUTIONS

——————————————————————————– ————- ————- ———-

select * from scott.emp                                                          dzk32w36udhc5             1          2

 

第五步:我们来证明软解析还是会去持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说软解析会去持有Child library cache lock,但不会同时持有Child library cacheChild library cache lock

证明步骤和思路与第二步一模一样,这里不再赘述,我们直接来看相关的process dump的trace文件:

d:\oracle\admin\cuihua\udump\cuihua_ora_2988.trc的内容:

Dump file d:\oracle\admin\cuihua\udump\cuihua_ora_2988.trc

Sat May 05 01:50:36 2012

ORACLE V10.2.0.1.0 – Production vsnsta=0

vsnsql=14 vsnxtr=3

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production

With the Partitioning, OLAP and Data Mining options

Windows XP Version V5.1 Service Pack 2

CPU                 : 2 – type 586

Process Affinity    : 0×00000000

Memory (Avail/Total): Ph:1031M/2030M, Ph+PgF:2330M/3921M, VA:1269M/2047M

Instance name: cuihua

 

Redo thread mounted by this instance: 1

 

Oracle process number: 24

 

Windows thread id: 2988, image: ORACLE.EXE (SHAD)

 

 

*** 2012-05-05 01:50:36.531

*** ACTION NAME:(Command Window – New) 2012-05-05 01:50:36.468

*** MODULE NAME:(PL/SQL Developer) 2012-05-05 01:50:36.468

*** SERVICE NAME:(cuihua) 2012-05-05 01:50:36.468

*** SESSION ID:(132.36) 2012-05-05 01:50:36.468

Received ORADEBUG command ‘dump processstate 10′ from process Windows thread id: 972, image:

===================================================

PROCESS STATE

————-

Process global information:

     process: 33E4FB7C, call: 33F61154, xact: 00000000, curses: 33F1F70C, usrses: 33F1F70C

  —————————————-

  SO: 33E4FB7C, type: 2, owner: 00000000, flag: INIT/-/-/0×00

  (process) Oracle pid=24, calls cur/top: 33F61154/33F61154, flag: (0) -

            int error: 0, call error: 0, sess error: 0, txn error 0

  (post info) last post received: 0 0 0

              last post received-location: No post

              last process to post me: none

              last post sent: 0 0 0

              last post sent-location: No post

              last process posted by me: none

    (latch info) wait_event=0 bits=0

        Location from where call was made: kgllkdl: child: cleanup: latch

        Context saved from call: 2

      waiting for 3213e0f0 Child library cache lock level=6 child#=1

        Location from where latch is held: kgllkdl: child: cleanup: latch

        Context saved from call: 2

        state=busy, wlstate=free

          waiters [orapid (seconds since: put on list, posted, alive check)]:

           25 (123, 1336153833, 123)

           27 (123, 1336153833, 123)

           11 (120, 1336153833, 120)

           20 (114, 1336153833, 114)

           19 (111, 1336153833, 111)

           24 (39, 1336153833, 39)

           waiter count=6

          gotten 261770 times wait, failed first 74 sleeps 22

          gotten 0 times nowait, failed: 0

      on wait list for 3213e0f0

    Process Group: DEFAULT, pseudo proc: 33E7F368

    O/S info: user: SYSTEM, term: Y8908CUIHUA, ospid: 2988

OSD pid info: Windows thread id: 2988, image: ORACLE.EXE (SHAD)

……省略显示其他内容

注意上述trace文件中有如下内容:

waiting for 3213e0f0 Child library cache lock level=6 child#=1

但注意这里没有holding Child library cache所以我们已经证明了软解析依然会持有Child library cache lock,但是是在释放了Child library cache的情况下,也就是说软解析会去持有Child library cache lock,但不会同时持有Child library cache和Child library cache lock。

 

第六步:我们来证明软解析还是会去持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说软解析会去持有Child library cache pin,但不会同时持有Child library cacheChild library cache pin

证明步骤和思路与第五步一模一样,这里不再赘述。

 

结合我文章开头提到的那三篇文章(尤其是大熊的那篇文章)和上面的六步证明过程,我们得出如下关于硬解析和软解析时latch持有情况的结论(这个结论是近似的,并没有包括所有的情况):

1SQL在硬解析的时候,会先去持有Child library cache以扫描library cache object handle链表,接着会释放Child library cache(因为找不到相关的library cache object handle),接下来会再次持有Child library cache(因为要修改library cache object handle链表),然后在不释放Child library cache的情况下进而去持有Child shared pool以分配内存,接着释放Child shared pool(因为已经分配完了),接着再释放Child library cache然后因为要以null模式获得library cache lock,所以要先持有Child library cache lock;接着要把相关sql的执行计划load到对应的子cursor的library cache object handle中,即要以exclusive模式获得library cache pin,所以要先持有Child library cache pin;

2SQL在软解析的时候和硬解析类似,只不过不需要持有Child shared pool并且持有Child library cache的时间会更短。

 

最后我们来说一下硬解析和软解析在latch争用情况下的本质区别:

1、软解析不会导致Child shared pool的争用;

2、硬解析跟软解析比会导致更为剧烈的Child library cache的争用,这个当然是因为硬解析在不释放Child library cache的情况下进而会去持有Child shared pool

 


8 Comments on “硬解析和软解析情况下latch持有情况分析”

    1. 测试的真仔细!

    2. overmars

      关于硬分析持有的latch,我看大部分的书都是这样描述:
      硬分析应该是先持有SQL handle的library cache lock ,然后在生成计划的过程中,对SQLhandle的LCO以exclusive 方式持有library cache pin.

      但是楼主的说法,好像恰好反过来了,请楼主能否进一步解释下2者持有的顺序,lock和pin的对象分别是针对的对象是谁?

    3. overmars

      ok,等待您的大作!

      • Cui Hua

        这个你是对的,应该是先持有library cache lock latch,再持有library cache pin latch,我已经改过来了。

    4. wzh

      测试非常详细,看完后加深了对软解析、硬解析的理解。

    5. taowang2016

      你好,goog job!
      我在重现以下过程:
      第三步:我们来证明硬解析会去持有Child library cache pin,但是是在释放了Child library cache的情况下,也就是说硬解析的时候不会同时持有Child library cache和Child library cache pin
      时出现了一个问题,向你请教。

      环境:windows 32
      oracle:10.2.0.1

      准备信息如下:
      session1:
      SQL> select ‘oradebug poke 0x’ || addr || ‘ 4 0×00000001′ from v$latch_children where name=’library cache pin’;

      ‘ORADEBUGPOKE0X’||ADDR||’40X00000001′
      ————————————————
      oradebug poke 0x240BDF50 4 0×00000001
      oradebug poke 0x240BDEE8 4 0×00000001
      oradebug poke 0x240BDE80 4 0×00000001

      SQL> alter system flush shared_pool;

      系统已更改。

      SQL> oradebug setmypid
      已处理的语句

      session2的准备信息如下:
      SQL> select sid from v$mystat where rownum=1;

      SID
      ———-
      148

      SQL> select spid from v$process where addr in (select paddr from v$session where sid=148);

      SPID
      ————————
      4072

      session3准备信息如下:
      SQL> oradebug setospid 4072

      Oracle pid: 19, Windows thread id: 4072, image: ORACLE.EXE (SHAD)

      接着
      在session1中poke了library cache pin

      SQL> oradebug poke 0x240BDF50 4 0×00000001
      BEFORE: [240BDF50, 240BDF54) = 00000000
      AFTER: [240BDF50, 240BDF54) = 00000001
      SQL> oradebug poke 0x240BDEE8 4 0×00000001
      BEFORE: [240BDEE8, 240BDEEC) = 00000000
      AFTER: [240BDEE8, 240BDEEC) = 00000001
      SQL> oradebug poke 0x240BDE80 4 0×00000001
      BEFORE: [240BDE80, 240BDE84) = 00000000
      AFTER: [240BDE80, 240BDE84) = 00000001

      session2中执行以下查询,被hang住:

      SQL> select * from scott.emp;
      select * from scott.emp
      hang住

      session3中dump process信息,此时出现了如下错误:

      SQL> oradebug dump processstate 10
      ORA-00072: 进程 “Windows thread id: 4072, image: ORACLE.EXE (SHAD)” 不活动

      同时,发现session2中有如下提示错误:

      select * from scott.emp
      *
      第 1 行出现错误:
      ORA-03113: 通信通道的文件结束

      请问这是如何导致的,经过测试library cache lock的对的,但这个library cache pin的测试做了很多次,一直都是上面的错误信息,是我测试的哪一步出了问题呢?

      • taowang2016

        还是同样的测试,刚特意在centos 5.7 64bit, oracle 10.2.0.1 64bit的环境上进行,就没有这个问题,也就是说在linux上的测试结果和你的是一样的,可是在windows上,试了很多次,一直提示如上的错误,问题可能出在什么地方呢?

转载于:https://www.cnblogs.com/taowang2016/archive/2013/02/25/2932069.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值