[20190409]latch get 参数where and why测试.txt

[20190409]latch get 参数where and why测试.txt


--//链接http://blog.itpub.net/267265/viewspace-2640890/大概了解哦latch get参数.做一个测试:


1.环境:

SYS@book> @ ver1

PORT_STRING                    VERSION        BANNER

------------------------------ -------------- --------------------------------------------------------------------------------

x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production


SYS@book> select addr, name from v$latch_parent where lower(name) like '%'||lower('test excl. parent2 l0')||'%';

ADDR             NAME

---------------- ----------------------------------------

0000000060009978 test excl. parent2 l0


--//拿这个拴锁作为测试.

$ cat lwhere.sql

column "where" format a36

column location format a36

column PARENT_NAME format a30

SELECT t1.inst_id inst_id

      ,t1.indx indx

      ,t1.ksllasnam PARENT_NAME

      ,t2.ksllwnam "WHERE"

      ,t1.kslnowtf NWFAIL_COUNT

      ,t1.kslsleep SLEEP_COUNT

      ,t1.kslwscwsl WTR_SLP_COUNT

      ,t1.kslwsclthg LONGHOLD_COUNT

      ,t2.ksllwnam LOCATION

  FROM x$ksllw t2, x$kslwsc t1

 WHERE t2.indx = t1.indx AND lower(t1.ksllasnam) LIKE lower('%&&1%');


SYS@book> @ lwhere 'test excl. parent2 l0'

no rows selected

--//没有查询到,顺便选一个数值作为测试,我测试好像这些参数oracle并不做检查.


2.测试:

--//session 1:

SYS@book> @ spidx


       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50

---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------

       295          5 65437                    DEDICATED 65438       21          3 alter system kill session '295,5' immediate;


PRO

-----------------------------------

SYS@book(295.5 spid=65438 pid=21)>


SYS@book(295.5 spid=65438 pid=21)>  oradebug setmypid

Statement processed.


SYS@book(295.5 spid=65438 pid=21)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 00000000


SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslgetl 0x0000000060009978 1 2 3

Function returned 1


SYS@book(295.5 spid=65438 pid=21)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 00000015

--//持有后写入该会话的PID号,奇怪拴锁写入pid,而mutext写入sid,不知道为什么.


--//session 2:

SYS@book> @ spidx


       SID    SERIAL# PROCESS                  SERVER    SPID       PID  P_SERIAL# C50

---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------

        44         71 65440                    DEDICATED 65441       27         26 alter system kill session '44,71' immediate;


PRO

-----------------------------------

SYS@book(44.71 spid=65441 pid=27)>


$ cat ll.sql

select ksuprpid pid,ksuprsid sid ,ksuprlat laddr,ksuprlnm name, ksuprlmd,ksulawhy,ksulawhr, ksulagts gets from x$ksuprlat;

--//https://andreynikolaev.wordpress.com链接有一些介绍,要获得where,why信息,要查询X$视图才行.

--//这个视图是x$ksupr的v$process的基表.x$ksuprlat仅仅包含latch的信息.


SYS@book(44.71 spid=65441 pid=27)>  @ll

    PID        SID LADDR            NAME                                     KSUPRLMD   KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- ---------- -------- ---------- ----------

     21        295 0000000060009978 test excl. parent2 l0                    EXCLUSIVE         2          3         13


SYS@book(44.71 spid=65441 pid=27)>  oradebug setmypid

Statement processed.


SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 4

--//挂起!!


--//session 3:

SYS@book> @ wait

P1RAW            P2RAW            P3RAW         P1         P2         P3        SID    SERIAL#       SEQ# EVENT      STATUS   STATE               WAIT_TIME_MICRO SECONDS_IN_WAIT WAIT_CLASS

---------------- ---------------- ----- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- ------------------- --------------- --------------- -----------

0000000060009978 0000000000000005 00    1610652024          5          0         44         71         29 latch free INACTIVE WAITING                    88607826              89 Other


--//session 1:

SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslfre 0x0000000060009978

Function returned 0


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 4

Function returned 1

--//现在成功返回.


SYS@book(44.71 spid=65441 pid=27)>  @ ll

    PID        SID LADDR            NAME                                     KSUPRLMD   KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- ---------- -------- ---------- ----------

     27         44 0000000060009978 test excl. parent2 l0                    EXCLUSIVE         2          4         14


--//现在是session 2 持有该拴锁.

SYS@book(44.71 spid=65441 pid=27)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 0000001B


--//0x1B=27,与当前会话的pid=27一直.

--//链接:http://andreynikolaev.wordpress.com/2010/08/24/exclusive-latches-in-memory-and-traces-oracle-versions-7-11-2/

Oracle 11g used PID value from v$process to indicate that exclusive latch is busy. Notice that Oracle latch is hold by

process, not session.  On the contrary mutexes are hold by sessions.


--//Oracle 11g使用v$process中的PID值来指示独占锁存器正处于繁忙状态。注意,Oracle闩锁是由进程而不是会话保存的。相反,互斥

--//是通过会话进行的。


Oracle not always used PID for this purpose in the past. In fact busy exclusive latch had the value of:


--//Oracle过去并不总是为此目的使用PID。事实上,忙碌的独占锁存器的值是:


1 in version 7

0xFF in versions 8.0-10.2

PID in version 11g


3.继续测试:

--//使用strace 跟踪ipcs也就是信号看看.

 $ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit


$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit


--//当前是session 2持有该拴锁.

SYS@book(44.71 spid=65441 pid=27)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 0000001B


--//session 1:

SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslgetl 0x0000000060009978 1 2 5

--//挂起!!


--//跟踪进程看到阻塞后执行semop睡眠等待.

$ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit

semop(310214656, 0x7fff12c040b0, 1


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslfre 0x0000000060009978

Function returned 0


--//跟踪进程看到释放拴锁时发出信息.session 2进程:

$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit

semctl(310214656, 25, SETVAL, 0x1)      = 0


--//session 1进程:

$ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit

semop(310214656, 0x7fff12c040b0, 1)     = 0


--//执行成功!!也就是oracle是通过semctl控制信号操作.


4.继续测试:

--//也就是如果使用oradebug poke修改信息应该是无效的.

--//当前是session 1持有该拴锁.

SYS@book(295.5 spid=65438 pid=21)>  oradebug peek 0x0000000060009978 4

[060009978, 06000997C) = 00000015


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 6


--//挂起!!


--//session 2:

$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit

semctl(310214656, 25, SETVAL, 0x1)      = 0

semop(310214656, 0x7fffd5533a10, 1


--//session 1,人为取消看看,使用oradebug poke:

SYS@book(295.5 spid=65438 pid=21)>  oradebug poke 0x0000000060009978 4 0x00000000

BEFORE: [060009978, 06000997C) = 00000015

AFTER:  [060009978, 06000997C) = 00000000


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslgetl 0x0000000060009978 1 2 6

Function returned 1


--//session 2:

$ strace -f -e ipc -p 65441

Process 65441 attached - interrupt to quit

semctl(310214656, 25, SETVAL, 0x1)      = 0

semop(310214656, 0x7fffd5533a10, 1)     = 0


--//session 1:

$ strace -f -e ipc -p 65438

Process 65438 attached - interrupt to quit

semop(310214656, 0x7fff12c040b0, 1)     = 0


--//session 1没有执行semctl.不过这样操作后:


SYS@book>  @ ll

    PID        SID LADDR            NAME                                     KSUPRLMD  KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- --------- -------- ---------- ----------

     21        295 0000000060009978 test excl. parent2 l0                    EXCLUSIVE        2          6         16

     27         44 0000000060009978 test excl. parent2 l0                    EXCLUSIVE        2          6         16


--//2个会话都持有该拴锁.这样是有问题的.

--//session 1:

SYS@book(295.5 spid=65438 pid=21)>  oradebug call kslfre 0x0000000060009978

Function returned 0


--//session 2:

SYS@book(44.71 spid=65441 pid=27)>  @ ll

    PID        SID LADDR            NAME                                     KSUPRLMD   KSULAWHY   KSULAWHR       GETS

------- ---------- ---------------- ---------------------------------------- ---------- -------- ---------- ----------

     27         44 0000000060009978 test excl. parent2 l0                    EXCLUSIVE         2          6         16


SYS@book(44.71 spid=65441 pid=27)>  oradebug call kslfre 0x0000000060009978

ORA-03113: end-of-file on communication channel

ORA-24323: value not allowed

--//会话已经中断了.


总结:

--//看了链接https://andreynikolaev.wordpress.com许多文章,才知道10g,11g关于latch发生很大变化,不能再按照以前spin and sleep

--//方式操作.摘抄一段https://fritshoogland.wordpress.com/2015/07/17/oracle-12-and-latches/的内容:


This is how the gdb output looks like when the latch get in willing to wait mode is executed:


kslgetl laddr:60023a80, willing:1, where:0, why:2442

kslges 60023a80, 0, 1, 0

kslwlmod 13311368, -1780327896, 1610758784, 1

skgpwwait 13311608, -1767360, -1780326976, 0

sskgpwwait 13311608, -1767360, -1780326976, 0

semop 360451, 13310840, 1, -1


Interestingly, if the latch is not taken, this is how the latch get sequence looks like:


kslgetl laddr:60023a80, willing:1, where:0, why:2442


In other words, for getting a non shared latch in willing to wait mode:


1-the function kslgetl is called, which tries to fetch the latch.

If the latch can be taken, the function returns, if not:

2-the function kslges (kernel service latch get spinning) is called, which supposedly also tries to take the same latch.

If the latch still can not be taken, the next function is:

3-the function kslwlmod (kernel service latch waiting list modify) is entered.

In this function the process registers itself as waiting in the post/wait list.

4-the function skgpwwait (system kernel generic post/wait wait) is entered.

This function sets up the waiting for the process so it can be posted.

5-the function sskgpwwait (system system kernel generic post/wait wait)

My current understanding is the 'ss' function contain the platform specific code for database functions.

6-the (operating system) function semop (semaphore operation) is called.


This will make the process sleep waiting on a semaphore (operating system signalling mechanism). This way, the process

will not be runnable on the CPU unless the semaphore is posed.


--//这将使进程休眠等待信号量(操作系统信令机制)。这样,该方法除非发出信号量,否则将无法在CPU上运行。

--//顺便问一下https://fritshoogland.wordpress.com/2015/07/17/oracle-12-and-latches/里面提到的地址如何获得的.

Some searching around revealed that a CPU register reveals this information. Add this to the above gdb script:


break *0xc29b51

  commands

    silent

    printf " kslges loop: %d\n", $ecx

    c

  end


--//有时间再看看.


来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2640940/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/267265/viewspace-2640940/

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值