oracle锁等待现象,高速缓冲内enq锁 - Oracle等待事件详细分析_数据库技术_Linux公社-Linux系统门户网站...

最新推荐文章于 2024-06-18 17:19:10 发布

weixin_39683734

最新推荐文章于 2024-06-18 17:19:10 发布

阅读量413

点赞数

文章标签： oracle锁等待现象

6、 enq：TC-contention

在手动执行检查点操作中，一部分需要获得TC锁(thread checkpointlock 或 tablespace checkpointlock )在获得TC锁过程中，若发生争用，则需要等待enq:TC-contention 事件。事实上获得TC锁的过程稍微复杂。

1) 服务器进程首先以X模式获得TC锁

2) 服务器进程将已获得的TC锁变更为SSX模式。同时，CKPT进程以SS模式获得该锁。CKPT获得锁后执行检查点操作。

3) 服务器欲重新以X模式获得TC锁，等待CKPT释放该锁，这时的等待事件就是enqueue：TC-contention

4) 检查点工作结束后，CKPT进程将会释放TC锁，服务器进程就会获得TC锁，因此得知检查点工作已经结束。

Enq：TC-contention 等待即便在没有多个进程引起争用的情况下，也可以发生，在这一点上与其他锁争用引起的等待现象不同。需要理解的是在等待现象中，存在只有争用才能引发的等待现象，但是也存在不发生争用，也会单纯为了等待工作结束而等待的情况。。

发生检查点的情况虽然很多，但不是所有的情况都会发生TC锁引起的等待，之后再进程由服务器进程引发的检查点同步过程中发生。

enq:TC-contention 等待发生的代表案例如下：并行查询和表空间热备

1:、并行查询(parallel query)

pq发生检查点的原因是slave session引起的direct path read。这就是所谓的“直接路径读”，它不经过高速缓冲区直接读取数据文件。Oracle在如下三种情况下使用direct path read(也叫physical read direct) 方式的读取。

(1)内存区域上不能完成排序工作时，会在临时段的区域里存储和读取的过程中，发生direct path write ，direct path read 。这时的等待事件可以通过direct path read temp、direct path write temp 观察。

(2)slave session(从属会话)为了扫描直接读取的数据文件时，使用direct path read 。这时等待事件通过direct path read 事件观察。

(3)若判断是因为I/O系统的性能下降，导致不能将以足够快的速度读取，oracle为了临时方便会使用direct path read。

slave session执行direct path read 对象时数据文件，从数据文件上直接读取数据时，因为不经过SGA，所以可能发生当期SGA上的块和数据文件上的块之间版本不一致的现象，为了防止这些现象，oracle对数据文件执行direct path read 之前，应该执行检查点。coordinate session在驱动slave session之前，对于执行direct path read现象，请求段级别的检查点，检查点发生之前一直处于enq：TC-contention等待事件状态。coordinate session上可以发现enq：TC-eontention等待，slave session上则可以发现direct path read 等待。

2、表空间热备份(tablespace hot backup)

执行alter tablespace 。。begin backup后，将属于此表空间的所有高速缓冲区的脏数据记录到磁盘上，这个过程经历enq:TC-contention 等待。

7、enq：CI-contention 和 enq：RO-contention

“Cross Instance call Enqueue”是一种在一个或多个instance实例间调用后台进程行为时用到的队列锁,具体调用的后台进程行为包括检查点checkpoint、日志切换logfile switch、shutdown实例、载入数据文件头等等。需要注意的是这种Enqueue Lock并不仅仅在RAC中使用，即便是单节点也会用到。CI锁的数量取决于并行执行Cross Instance Call调用的进程的总数。SQL> col ksqsttyp for a20

SQL> col ksqstrsn for a20

SQL> col ksqstexpl for a80

SQL> set linesize 200 pagesize 2000;

SQL>select ksqsttyp,ksqstrsn,ksqstexpl from x$ksqst where ksqsttyp='CI';KSQSTTYP KSQSTRSN KSQSTEXPL

-------------------- -------------------- --------------------------------------------------------------------------------

CI contention Coordinates cross-instance function invocations

SQL>show parameter cluster_databaseNAME TYPE VALUE

------------------------------------ ----------- ------------------------------

cluster_database boolean FALSE

cluster_database_instances integer 1

SQL>select * from v$enqueue_stat where eq_type='CI';INST_ID EQ TOTAL_REQ# TOTAL_WAIT# SUCC_REQ# FAILED_REQ# CUM_WAIT_TIME

---------- -- ---------- ----------- ---------- ----------- -------------

1 CI 595 0 595 0 0

当系统中出现有大量这种跨实例后台进程调用时，将出现CI队列锁的争用。

假设在一个RAC场景中，同时有大量的回话开始对不同的数据表执行TRUNCATE截断操作，TRUNCATE的一个前提是在所有实例上(因为对象表的dirty buffer可能分布在多个实例上)发生对象级别的检查点(object level checkpoint)，检查点发生时CKPT进程会通知DBWR写出指定对象表相关的脏块，DBWR需要扫描Buffer Cache以找出脏块，而如果Buffer Cache很大那么扫描将花费大量的时间，而在此过程中前台进程将一直排他地持有着本地的CI队列锁，这就将造成CI锁的严重争用。

为了减少CI队列锁地争用，我们第一步所要做的是找出实际的Cross Instance call跨实例调用的类型。这里要另外提一下的是在10g以前不管是v$session_wait或statspack中都不会将enqueue锁等待事件的具体enqueue lock类型写明，一般需要我们从p1/p2/p3列中找出enqueue的具体身份，例如”WAIT #1: nam=’enqueue’ ela= 910796 p1=1128857606 p2=1 p3=4″，这里的p1为1128857606也就是16进制的43490006，高位的’4349′转换为ascii码也就是’CI’，而这里的p2/p3对应为V$lock中的ID1/ID2，ID1=1代表了”Reuse (checkpoint and invalidate) block range”，ID2=4代表了”Mounted excl, use to allocate mechanism”。

具体ID1/ID2代表的含义在不同版本中有所变化，可以参考下表:Id1, Id2 Combination:

Oracle 10gR1

27 TO 29 *Same as 9i R2

30 process waiters after row cache requeue

31 Active Change Directory extent relocation

32 block change tracking state change

33 kgl mulitversion obsolete

34 set previous resetlogs data

35 set recovery destination pointer

36 fast object reuse request

37 test ksbcic()

38 ASM diskgroup discovery wait

39 ASM diskgroup release

40 ASM push DB updates

41 ASM add ACD chunk

42 ASM map resize message

43 ASM map lock message

44 ASM map unlock message (phase 1)

45 ASM map unlock message (phase 2)

46 ASM generate add disk redo marker

47 ASM check of PST validity

48 ASM offline disk CIC

49 Logical Standby Sync Point SCN

50 update SQL Tuning Base existence bitvector

51 PQ induced Checkpointing

52 ASM F1X0 relocation

53 Scheduler autostart

54 KZS increment grant/revoke counter

55 ASM disk operation message

56 ASM I/O error emulation

57 DB Supp log cursor invalidation

58 Cache global range invalidation

59 Cache global object invalidation

60 ASM Pre-Existing Extent Lock wait

61 Perform a ksk action through DBWR

62 ASM diskgroup refresh wait

Oracle 10gR2

30 to 62 *Same as 10gR1

63 KCBO object checkpoint

64 KCBO object pq checkpoint

65 global health check event

66 Oracle Label Security refresh

67 thread internal enable

68 cross-instance registration

69 KGL purge unused subheaps

70 clear pin instance flag

71 Rolling operations CIC

Oracle 9iR2

Id1 Meaning

~~~ ~~~~~~

25 TO 26 *Same as 9i R1

27 set Database Force Logging mode

28 invalidate cached file address translations

29 Cursor Unauthorize Mode

30 snapshot too old diagnosis

31 process waiters after row cache requeue

Oracle 9iR1

Id1 Meaning

~~~ ~~~~~~~~

18 TO 24 *Same as Oracle 8i

25 Update Dscn Tracking (ktcndt)

26 Purge dictionary Object number Cache

Oracle 8i

Id1 Meaning

~~~ ~~~~~~~

0 TO 17 *Same as Oracle 8(please see "Oracle8*" for the Meaning)

18 Object reuse request

19 Rolling release checks

20 Propagate begin backup scn for a file

21 Refresh top plan (for db scheduler)

22 Clear checkpoint progress record

23 Drop temp file

24 Quiesce database Restricted

Id2 Meaning

~~~ ~~~~~~~

0x01 Used to pass in parameters

0x02 Used to invoke the function in backgroud process

0x03 Used to indicate the foreground has not returned

0x04 Mounted excl, use to allocate mechanism

0x05 Used to queue up interested clients

Oracle 8*

Id1 Meaning

~~~~~~~~~~

0 Checkpoint block range

1 Reuse (checkpoint and invalidate) block range

2 LGWR Checkpointing and Hot Backup

3 DBWR syncronization of SGA with control file

4 Log file add/drop/rename notification

5 Write buffer for CR read

6 Test call

7 Invalidate KCK cache in all instances

8 Alter rollback segment optimal

9 Signal Query Servers/coordinator

10 Create Remote parallel query Server

11 Set Global Partitions

12 Stop Disk Writes

13 Drop Sort Segments

14 Release unused space from Sort Segments

15 Instance Recovery for Parallel operation Group

16 Validate parallel slave Lock Value

17 Check transaction state objects

18 Flush blocks in object

19 Rolling release checks

20 Propagate begin backup scn for a file

21 Clear checkpoint progress record

Oracle 7

Id1 Meaning

~~~~~~~~~~

0 Flush buffers for reuse as new class

1 LGWR checkpointing and Hot Backup

2 DBWR synchronization of SGA with control file

3 Log file add/drop/rename notification

4 Write buffer for CR read

5 Test Call

6 Invalidate KCK cache in all instances

7 Alter rollback segment optimal

8 Signal Query Servers/coordinator

9 Create Remote Parallel Query Server

10 Set Global Partitions

11 Stop Disk Writes

12 Drop Sort Segments

13 Release unused space from Sort Segments

14 Instance Recovery for Parallel operation Group

15 Validate parallel slave Lock Value

16 Check Transaction State Objects

Id2 Meaning

~~~~~~~~~~

1 Pass in Parameters

2 Invoke the call in background process

3 Foreground has not returned yet

4 Used to allocate the CI call

5 Used to queue up interested clients

enq: RO - fast object reuse 等待事件

查了一下这个等待，出现这个等待比较高的情况一般都有异常：

1.truncate表或者分区表时

2.收集统计信息采用degree>1时

这个event表示在等待DBWR to clean cache.

出现异常的时候症状：the CKPT background process is the one holding the needed RO enqueue although it is actually doing nothing.

Bug:7385253.

这个wait event表示在等待DBWR to clean cache.

如果要优化这个问题,需要综合考虑,比如减少cache size,增加dbwr process或减少MTTR等等。

Is it still locked?

For some reason the truncate is still waiting for CKPT process. When you truncate or drop a table, CKPT does a range flush of the db_cache_size, which seems to be completed according to your alert_log.

That was an issue in 9i-10g. This looks like bug 4201369 which is supposed to be fixed in 10.1.0.5. I will suggest you open a tar on this! They will make you do a hanganalysis which should clarify the issue.

该等待事件多与bug 相关

2.1 Bug 1：Bug 7385253

Bug 7385253 - Slow Truncate / DBWR useshigh CPU / CKPT blocks on RO enqueue [ID 7385253.8]

Affects:

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions >= 10 but BELOW 11.2

Versions confirmed as being affected

Platforms affected

Generic (all / most platforms affected)

Fixed:

该Bug的3个表现：

DBWR may use alot of CPU and seem to spin in or around kcbo_write_qdue to large number offree buffers on the object reuse queue or checkpoint queue.

In some casesthe CKPT holds the RO enqueue for very long blocking other operations with waitevent "enq: RO - fast objectreuse".

Operations so farreported being affected are :

- Apply Processes in StandBy databases

- Gather stats

- Truncates

- drop/shrink/alter tablespace

Note: This fix was previously incorrectlylisted as not affecting 11g.

The bug itself is present in 11g but it is unlikely to show anysignificant symptom due to other 11g changes meaning that free buffers are nolonger kept on the object queue.

对与该Bug的解决方法：

setting _db_fast_obj_truncate=FALSE

enabling asyn i/o

applying 7287289

2.2文档二

'enq: RO - fastobject reuse' contention when gathering schema/table statistics in parallel [ID762085.1]

Symptoms：

(1)Database has been recently upgradedfrom 10.2.0.1 to 10.2.0.4.

(2)There is 'enq: RO - fastobject reuse' contention when gathering schema/table statistics in parallelusing DBMS_STATS package (with DEGREE>1).

其也是因为Bug 7385253导致这个问题。

解决方法：

1) Flushing the buffer cache.

2) Setting "_db_fast_obj_truncate" =FALSE. This reverts back to the9i way of invalidating buffers in the buffer cache.

Kindly note thatboth workarounds could have an impact on the database performance. Instead, itis recommended applying the corresponding patch.

--这2种解决方法对db性能都有很大影响，建议应用合适的patch。

2.3文档三

Bug8544896 - Waits for "enq: RO - fast object reuse" with high DBWR CPU[ID 8544896.8]

Affects:

Product (Component)

Oracle Server (Rdbms)

Range of versions believed to be affected

Versions >= 10.2.0.4 but BELOW 10.2.0.5

Versions confirmed as being affected

Platforms affected

Generic (all / most platforms affected)

It is believed to be a regression in default behaviour thus:

Regression introduced in 10.2.0.4

Fixed:

This problem is introduced in 10.2.0.4.

Sessions can wait on "enq: RO - fastobject reuse" while DBWR consumes lots of CPU when performing truncatetype operations.

Workaround:

(1)Flush the buffer cache beforetruncating

(2) set _db_fast_obj_truncate = FALSE.

我这里出现这2个等待事件都与Truncate操作有关。

weixin_39683734

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
oracle锁等待现象,高速缓冲内enq锁 - Oracle等待事件详细分析_数据库技术_Linux公社-Linux系统门户网站...

6、 enq：TC-contention在手动执行检查点操作中，一部分需要获得TC锁(thread checkpointlock 或 tablespace checkpointlock )在获得TC锁过程中，若发生争用，则需要等待enq:TC-contention 事件。事实上获得TC锁的过程稍微复杂。1)服务器进程首先以X模式获得TC锁2)服务器进程将已获得的TC锁变更为SSX模式。同时，C...
复制链接

扫一扫