Buffer Cache 不同导致执行时间不同UNNEST HASH_SJ gather_plan_statistics

Gen 2 Exadata Cloud at Customer - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS

On : NA version, Database Performance Issues

ACTUAL BEHAVIOR
---------------

When a same  SQL statement is executed in each instance of RAC, the elapsed times are different.
The elapsed time  if the 2nd instance is 4 times greater than it of the 1st instance.

+Offending SQL statement

SELECT  /*+ gather_plan_statistics FULL(T) PARALLEL(T 4) MONITOR */ *
FROM TB_OPT_MVMT_MST T
WHERE 1=1
AND
NOT EXISTS
( SELECT /*+ UNNEST HASH_SJ */ 1 FROM TB_MVMT@LK_NOIS B WHERE 1 = 1 AND B.MVMT_ID = T.MVMT_ID )

+ Result of the execution of the offending query


On 1st Instance :   Elapsed Time : 28 rows returned,   7 mins 23 secs
On 2nd Instance:   Elapsed Time : 28 rows returned, 31 mins 28 secs



 

CAUSE

1) Comparing Execution Plans and Cell Offloading Statistics from both instances

As you see below, the execution plans of both case are the same but Cell Offloading Statistics are different each other.

Cell offloading operation  has been executed only in the 1st instance.

## Execution on the 2nd instance (Bad Case)

Global Stats
===========================================================================================
| Elapsed |   Cpu     |    IO       | Concurrency  | Cluster   | Fetch | Buffer  | Read | Read      | Write   | Write   |    Offload            |
| Time(s) | Time(s) | Waits(s) |  Waits(s)        | Waits(s) | Calls  |  Gets   | Reqs | Bytes    | Reqs    | Bytes  | Returned Bytes |
===========================================================================================
|    1918 |        787 |       996 |               0.00 |        135 |         3 |    38M | 332K | 323GB | 32475 |  31GB |                 63GB |
===========================================================================================


SQL Plan Monitoring Details (Plan Hash Value=1180455967)
=================================================
| Id |           Operation                                   |      Name                       |
|    |                                                                |                                       |
=================================================
|  0 | SELECT STATEMENT                         |                                       |
|  1 |   HASH JOIN RIGHT ANTI                |                                       |
|     |                                                               |                                       |
|     |                                                               |                                       |
|  2 |     REMOTE                                          | TB_MVMT                     |
|     |                                                               |                                       |
|  3 |     PARTITION RANGE ALL               |                                       |
|  4 |      TABLE ACCESS STORAGE FULL | TB_OPT_MVMT_MST |
|     |                                                               |                                       |
|     |                                                               |                                       | 
=================================================
 

=========================================================
== Cell Offloading Statistics After executing querying
=========================================================
physical read total bytes                                                       346,393,452,544
cell physical IO bytes eligible for predicate offload                                         0
cell physical IO bytes saved by storage index                                                0
cell num smart IO sessions using passthru mode due to cellsrv                   0
cell physical IO interconnect bytes returned by smart scan                          0
cell num bytes in passthru during predicate offload                                         0
table fetch continued row                                                                                 17
cell blocks processed by cache layer                                                               0
cell blocks processed by txn layer                                                                    0
cell blocks processed by data layer                                                                  0
cell blocks processed by index layer                                                                 0
cell IO uncompressed bytes                                                                              0
cell scans                                                                                                            0
=========================================================

## Execution on the 1st instance (Good Case)

Global Stats
==============================================================================================================================
| Elapsed |   Cpu     |    IO        | Application |  Cluster  |   Other   | Fetch | Buffer |  Read | Read    | Write     | Write   | Uncompressed |    Offload   |    Offload           |  Cell        |
| Time(s) | Time(s) | Waits(s ) |  Waits(s)     | Waits(s) | Waits(s) | Calls |  Gets  |  Reqs | Bytes   | Reqs     | Bytes  |             Bytes    | Elig Bytes | Returned Bytes | Offload    |
==============================================================================================================================
|     443   |       216 |         222 |            0.20 |       0.02 |       4.73 |        3 |   38M | 359K | 324GB | 34220 |  33GB |              288GB |      291GB |                 97GB |  66.55% |
==============================================================================================================================


SQL Plan Monitoring Details (Plan Hash Value=1180455967)
=================================================
| Id |           Operation                                   |      Name                       |
|    |                                                                |                                       |
=================================================
|  0 | SELECT STATEMENT                         |                                       |
|  1 |   HASH JOIN RIGHT ANTI                |                                       |
|     |                                                               |                                       |
|     |                                                               |                                       |
|  2 |     REMOTE                                          | TB_MVMT                     |
|     |                                                               |                                       |
|  3 |     PARTITION RANGE ALL               |                                       |
|  4 |      TABLE ACCESS STORAGE FULL | TB_OPT_MVMT_MST |
|     |                                                               |                                       |
|     |                                                               |                                       | 
=================================================

===========================================================
== Cell Offloading Statistics After executing querying
===========================================================
physical read total bytes                                                                 348,210,774,016
cell physical IO bytes eligible for predicate offload                      312,605,237,248
cell physical IO bytes saved by storage index                                                         0
cell num smart IO sessions using passthru mode due to cellsrv                            0
cell physical IO interconnect bytes returned by smart scan         33,353,118,952
cell num bytes in passthru during predicate offload                                                  0
table fetch continued row                                                                                            0
cell blocks processed by cache layer                                                      48,448,889
cell blocks processed by txn layer                                                          48,448,889
cell blocks processed by data layer                                                        37,791,290
cell blocks processed by index layer                                                                         0
cell IO uncompressed bytes                                                          309,595,309,280
cell scans                                                                                                                385
===========================================================

2) Analysis

The root cause of this issue is that  both instances have the different size of  db buffer cache

When the decision on the execution of   DIRECT READ is made on the basis of  the value of VLOT which is calculated based on buffer cache size.

The calculated size of VLOT in the 2nd instance is greater than Object Size so Oracle chooses Cache Read rather than Direct  Read


 
a. NSMTIO event tracing in the 1st Instance

 VLOT = 500% of Buffer Cache  (vlot 500)
Buffer cache size  is  5396334   so the value  of  VLOT is  26981670. The value of VLOT is smaller  than the value of Object Size(27629415) so Direct Read  was chosen.
Hence, Smart Scan works fine


NSMTIO: kcbism: islarge 1 next 0 nblks 27629415 type 2, bpid 3, kcbisdbfc 0 kcbnhl 131072 kcbstt 107926 keep_nb 0 nbuf 5396334 kcbnwp 2 pstt 0 Exadata 1 bsz 8192 objd 257239
NSMTIO: kcbimd: nblks 27629415 kcbstt 107926 kcbnbh 539633 kcbisdbfc 3 is_medium 0 objd 0
NSMTIO: kcbivlo: nblks 27629415 vlot 500 pnb 5396334 kcbisdbfc 0 is_large 1
NSMTIO: qertbFetch:DirectRead:[OBJECT_SIZE>VLOT]
NSMTIO: Additional Info: VLOT=26981670
Object# = 257239, Object_Size = 27629415 blocks
SqlId = b2kqw3372uqu6, plan_hash_value = 1180455967, Partition# = 0

 


b.  NSMTIO event tracing in the 2nd Instance

VLOT = 500% of Buffer Cache  (vlot 500)
 Buffer cache size  is  6007521   so the value  of  VLOT is  30037605. The value of VLOT is greater  than the value of Object Size(27629415) so Cache Read  was chosen.
 Hence, Smart Scan doesn't work fine.


NSMTIO: kcbism: islarge 1 next 0 nblks 27629415 type 2, bpid 3, kcbisdbfc 0 kcbnhl 131072 kcbstt 120150 keep_nb 0 nbuf 6007521 kcbnwp 2 pstt 0 Exadata 1 bsz 8192 objd 257239
NSMTIO: kcbimd: nblks 27629415 kcbstt 120150 kcbnbh 600752 kcbisdbfc 3 is_medium 0 objd 0
NSMTIO: kcbivlo: nblks 27629415 vlot 500 pnb 6007521 kcbisdbfc 0 is_large 0
NSMTIO: qertbFetch:[MTT < OBJECT_SIZE < VLOT]: Checking cost to read from caches(local/remote) and checking storage reduction factors (OLTP/EHCC Comp)
NSMTIO: kcbdpc:NoDirectRead:[CACHE_READ]: tsn: 10, objd: 257239, objn: 257239
ckpt: 0, nblks: 27629415, ntcache: 0, ntdist:0, Exadata:1, bsz:8192, rule 0, threshold 104857600, rule_enabled 0\nDirect Path for pdb 10 tsn 10 objd 257239 objn 257239
Direct Path 0 ckpt 0, nblks 27629415 ntcache 0 ntdist 0
Direct Path mndb 0 tdiob 630 txiob 0 tciob 32
Direct path diomrc 128 dios 2 kcbisdbfc 0
kcbdpc: kx 1.500000 kc 1.000000 lhs 27629415.000000 rhs_ckpt 27629415.000000 rhs_dp 27629415.000000 rio 1.000000 cached 0 Exadata 1 bsz 8192 offload_rule 0
threshold 104857600 rule_enabled 0
NSMTIO: Additional Info: VLOT=30037605
Object# = 257239, Object_Size = 27629415 blocks
SqlId = 4tcutpnddyvm7, plan_hash_value = 1180455967, Partition# = 0
NSMTIO: fple: FALSE, sage: TRUE, isTableSpaceOnSage: 1, invalid table sapce number: FALSE, prmsdec: TRUE, is kernel txn table space encrypted: -1,is enc_ktid encrypted: -1,
 is sage enabled based on data -layer checks: 1, isQesSageEnabled: TRUE




c. Conclusion

The different sizes of buffer caches of both instances caused  VLOTs to have the different values and it affected the decision on Direct Read.

SOLUTION

There are two solutions

1) Setting  the buffer cache size of both instances to be the same value

Or

2) Reducing  the vale of VLOT by changing the value of the parameter '_very_large_object_threshold'
    Currently, Object sizes of both instances are the same and  the size of buffer cache  of the 2nd instance is greater than it of the 1st instance.
    and  the customer said he/she can't reduce the size of buffer cache of the 2nd instance because it may affect the whole performance of the system.
    Under this circumstance, we can reduce  the value of VLOT by changing the value of the parameter '_very_large_object_threshold'
    For example, both instances has the same value,500, on the parameter  '_very_large_object_threshold' now.
    The 2nd instance  has  a value of VLOT  almost  similar to it of the 1st instance by reducing the value of the parameter '_very_large_object_threshold'  to 450.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值