LATCH ROW CACHE OBJECT问题分析
客户的系统,先后发生了几次Latch:row cache object等待,该等待会导致CPU迅速达到100%。
SQL> select event,p1,P1RAW from v$session where event='latch: row cache objects';
EVENT P1 P1RAW
-------------------------------------------- ---------- ----------------
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
latch: row cache objects 5.0440E+17 07000018A74C3918
我们可以从p1raw的值中可以看到,几乎全部的p1raw等待的地址都是07000018A74C3918。继续通过这个地址查询基表,我们发现全部的等待都集中在数据字典dc_user上。
SQL> SELECT kqrsttxt PARAMETER, kqrstcid CACHE#, kqrstcln "Lchild#", kqrstgrq "DCGets", l.gets "LGets", l.misses "Misses" FROM X$KQRST, V$LATCH_CHILDREN l WHERE l.addr='07000018A74C3918' and l.child#=KQRSTCLN ORDER BY 1,2;
PARAMETER CACHE# Lchild# DCGets LGets Misses
--------------------------- ---------- ---------- ---------- ----------
dc_users 7 7 -432944865 2494354363 3396983919
dc_users 7 7 0 2494354363 3396983919
dc_users 7 7 -488978240 2494354363 3396983919
dc_users 7 7 6423076 2494354363 3396983919
出问题时间段AWR报告的情况。latch: row cache objects占据了整个等待的53.5%
Top 5 Timed Events
Event | Waits | Time(s) | Avg Wait(ms) | % Total Call Time | Wait Class |
---|---|---|---|---|---|
latch: row cache objects | 3,074,567 | 292,212 | 95 | 53.5 | Concurrency |
CPU time | 147,945 | 27.1 | |||
db file sequential read | 4,342,112 | 37,436 | 9 | 6.9 | User I/O |
gc buffer busy | 2,573,835 | 10,621 | 4 | 1.9 | Cluster |
read by other session | 493,143 | 2,536 | 5 | .5 | User I/O |
数据字典使用情况
Dictionary Cache Stats
- “Pct Misses” should be very low (< 2% in most cases)
- “Final Usage” is the number of cache entries being used
Cache | Get Requests | Pct Miss | Scan Reqs | Pct Miss | Mod Reqs | Final Usage |
---|---|---|---|---|---|---|
dc_awr_control | 148 | 0.00 | 0 | 2 | 1 | |
dc_database_links | 2,361 | 0.00 | 0 | 0 | 1 | |
dc_files | 36,460 | 0.00 | 0 | 0 | 3,646 | |
dc_global_oids | 2,833 | 0.28 | 0 | 0 | 30 | |
dc_histogram_data | 36,128 | 3.26 | 0 | 0 | 30,437 | |
dc_histogram_defs | 54,038 | 3.81 | 0 | 0 | 28,382 | |
dc_object_grants | 652 | 3.07 | 0 | 0 | 65 | |
dc_object_ids | 94,802 | 0.32 | 0 | 0 | 4,453 | |
dc_objects | 8,349 | 3.03 | 0 | 59 | 2,702 | |
dc_profiles | 1,176 | 0.00 | 0 | 0 | 2 | |
dc_rollback_segments | 961,666 | 0.00 | 0 | 0 | 925 | |
dc_segments | 23,648 | 39.61 | 0 | 1,535 | 69,506 | |
dc_sequences | 739 | 82.14 | 0 | 739 | 224 | |
dc_tablespace_quotas | 4,865 | 4.07 | 0 | 0 | 162 | |
dc_tablespaces | 149,688 | 0.00 | 0 | 0 | 229 | |
dc_usernames | 24,338 | 0.00 | 0 | 0 | 979 | |
dc_users | 121,991,028 | 0.00 | 0 | 0 | 2,929 | |
outstanding_alerts | 2,735 | 97.95 | 0 | 6 | 452 |
正常时间段AWR报告
Top 5 Timed Events
Event | Waits | Time(s) | Avg Wait(ms) | % Total Call Time | Wait Class |
---|---|---|---|---|---|
db file sequential read | 5,091,217 | 41,713 | 8 | 65.6 | User I/O |
CPU time | 18,880 | 29.7 | |||
gc buffer busy | 1,098,325 | 4,530 | 4 | 7.1 | Cluster |
db file scattered read | 269,197 | 1,225 | 5 | 1.9 | User I/O |
db file parallel write | 1,113,985 | 1,060 | 1 | 1.7 | System I/O |
正常时间段数据字典使用情况
Dictionary Cache Stats
- “Pct Misses” should be very low (< 2% in most cases)
- “Final Usage” is the number of cache entries being used
Cache | Get Requests | Pct Miss | Scan Reqs | Pct Miss | Mod Reqs | Final Usage |
---|---|---|---|---|---|---|
dc_awr_control | 67 | 1.49 | 0 | 1 | 1 | |
dc_database_links | 2,801 | 0.00 | 0 | 0 | 1 | |
dc_files | 76,566 | 0.00 | 0 | 0 | 3,646 | |
dc_global_oids | 1,926 | 0.00 | 0 | 0 | 22 | |
dc_histogram_data | 22,093 | 5.27 | 0 | 0 | 29,261 | |
dc_histogram_defs | 26,514 | 6.77 | 0 | 0 | 27,758 | |
dc_object_grants | 96 | 0.00 | 0 | 0 | 45 | |
dc_object_ids | 91,305 | 0.15 | 0 | 0 | 4,229 | |
dc_objects | 5,241 | 1.74 | 0 | 74 | 2,599 | |
dc_profiles | 1,396 | 0.00 | 0 | 0 | 2 | |
dc_rollback_segments | 376,713 | 0.00 | 0 | 0 | 925 | |
dc_segments | 22,509 | 55.27 | 0 | 1,992 | 60,544 | |
dc_sequences | 676 | 86.09 | 0 | 676 | 243 | |
dc_tablespace_quotas | 6,815 | 2.61 | 0 | 0 | 163 | |
dc_tablespaces | 83,635 | 0.00 | 0 | 0 | 229 | |
dc_usernames | 22,585 | 0.00 | 0 | 0 | 978 | |
dc_users | 69,522,674 | 0.00 | 0 | 0 | 2,928 | |
outstanding_alerts | 2,730 | 98.02 | 0 | 0 | 5 |
经过正常时间段和异常时间段相比,发现dc_users的request的数量从69522674上升到了121991028次,多了一倍,正好和我们前面查询出的结果吻合。
通过MOS文档Heavy “latch: row cache objects” Contention on dc_objects and dc_users when Using Roles and System Level Grants (文档 ID 1639293.1)中间的描述。
Since a set of privileges grouped together in a role cannot be static and can change frequently, the role cannot be cached as a library_cache object. Additionally, because the role includes system level privileges it needs to be looked up during every execution of the cursor irrespective of whether it is cached or not.
Under heavy concurrent execution this can lead to high numbers or lookups for object privileges and cause ‘latch: row cache objects’ contention
If you are granting ‘GRANT ANY …’ privileges to users then, because of an optimization introduces in Bug 10023443, if you have thousands of different users executing RPCs without direct privileges then a large set of validation data has to be scanned taking a lot of time and leading to high row cache contention. Granting the privileges directly to the object avoids this contention.
Solution
If these symptoms occur, explicitly grant access to the underlying object to the user (not through the role):
这里提到如果用户不是当前的用户,而大批量执行会导致大量对对象权限的检查,引起latch:row cache objects,一个有效的避免办法就是直接赋予对象授权。
我们在看高并发一个时间段具体的情况,这里可以看到user_id集中在120上,总共有69个会话同一时间发起对不同对象的请求。
SQL> select USER_ID,to_char(sample_time,'YYYY-MM-DD HH24:MI:ss') atime,event,CURRENT_OBJ#
from dba_hist_active_sess_history where SNAP_ID >46342 and SNAP_ID<=46343
and event='latch: row cache objects' and to_char(sample_time,'YYYY-MM-DD HH24:MI:ss')='2014-12-31 10:53:41' and CURRENT_OBJ# not in (-1,0);
USER_ID ATIME EVENT CURRENT_OBJ#
---------- ------------------- ---------------------------------------------------------------- ------------
120 2014-12-31 10:53:41 latch: row cache objects 7956834
120 2014-12-31 10:53:41 latch: row cache objects 7956834
120 2014-12-31 10:53:41 latch: row cache objects 8107750
120 2014-12-31 10:53:41 latch: row cache objects 8435865
120 2014-12-31 10:53:41 latch: row cache objects 7956834
120 2014-12-31 10:53:41 latch: row cache objects 8107765
120 2014-12-31 10:53:41 latch: row cache objects 1855183
120 2014-12-31 10:53:41 latch: row cache objects 8431388
120 2014-12-31 10:53:41 latch: row cache objects 8107772
120 2014-12-31 10:53:41 latch: row cache objects 8107741
120 2014-12-31 10:53:41 latch: row cache objects 8107747
123 2014-12-31 10:53:41 latch: row cache objects 4924792
120 2014-12-31 10:53:41 latch: row cache objects 8107751
123 2014-12-31 10:53:41 latch: row cache objects 5199531
120 2014-12-31 10:53:41 latch: row cache objects 8431388
120 2014-12-31 10:53:41 latch: row cache objects 8431389
120 2014-12-31 10:53:41 latch: row cache objects 8107762
120 2014-12-31 10:53:41 latch: row cache objects 8107770
120 2014-12-31 10:53:41 latch: row cache objects 8107758
120 2014-12-31 10:53:41 latch: row cache objects 8107762
103 2014-12-31 10:53:41 latch: row cache objects 5138768
123 2014-12-31 10:53:41 latch: row cache objects 4924792
120 2014-12-31 10:53:41 latch: row cache objects 8428997
120 2014-12-31 10:53:41 latch: row cache objects 8107770
120 2014-12-31 10:53:41 latch: row cache objects 8431319
120 2014-12-31 10:53:41 latch: row cache objects 8107749
123 2014-12-31 10:53:41 latch: row cache objects 8470682
120 2014-12-31 10:53:41 latch: row cache objects 8107764
120 2014-12-31 10:53:41 latch: row cache objects 8107747
120 2014-12-31 10:53:41 latch: row cache objects 8107771
120 2014-12-31 10:53:41 latch: row cache objects 8435513
120 2014-12-31 10:53:41 latch: row cache objects 8472729
120 2014-12-31 10:53:41 latch: row cache objects 8107765
123 2014-12-31 10:53:41 latch: row cache objects 2540191
120 2014-12-31 10:53:41 latch: row cache objects 1882550
120 2014-12-31 10:53:41 latch: row cache objects 8107771
120 2014-12-31 10:53:41 latch: row cache objects 6916742
120 2014-12-31 10:53:41 latch: row cache objects 8428997
120 2014-12-31 10:53:41 latch: row cache objects 8107766
120 2014-12-31 10:53:41 latch: row cache objects 1886171
120 2014-12-31 10:53:41 latch: row cache objects 8107767
120 2014-12-31 10:53:41 latch: row cache objects 8107754
120 2014-12-31 10:53:41 latch: row cache objects 8164310
120 2014-12-31 10:53:41 latch: row cache objects 1882543
120 2014-12-31 10:53:41 latch: row cache objects 8107752
120 2014-12-31 10:53:41 latch: row cache objects 1855381
120 2014-12-31 10:53:41 latch: row cache objects 1248853
120 2014-12-31 10:53:41 latch: row cache objects 8107726
120 2014-12-31 10:53:41 latch: row cache objects 8107756
120 2014-12-31 10:53:41 latch: row cache objects 8107750
103 2014-12-31 10:53:41 latch: row cache objects 8427924
120 2014-12-31 10:53:41 latch: row cache objects 8435865
120 2014-12-31 10:53:41 latch: row cache objects 8107754
120 2014-12-31 10:53:41 latch: row cache objects 8428997
120 2014-12-31 10:53:41 latch: row cache objects 8431388
120 2014-12-31 10:53:41 latch: row cache objects 1855351
120 2014-12-31 10:53:41 latch: row cache objects 8107766
120 2014-12-31 10:53:41 latch: row cache objects 1889082
120 2014-12-31 10:53:41 latch: row cache objects 7624584
120 2014-12-31 10:53:41 latch: row cache objects 8428997
120 2014-12-31 10:53:41 latch: row cache objects 1855718
123 2014-12-31 10:53:41 latch: row cache objects 4924792
120 2014-12-31 10:53:41 latch: row cache objects 2540227
120 2014-12-31 10:53:41 latch: row cache objects 8435865
120 2014-12-31 10:53:41 latch: row cache objects 1905505
120 2014-12-31 10:53:41 latch: row cache objects 3026654
123 2014-12-31 10:53:41 latch: row cache objects 8157538
120 2014-12-31 10:53:41 latch: row cache objects 8107742
120 2014-12-31 10:53:41 latch: row cache objects 8107747
69 rows selected.
查询user_id为120用户的权限情况
SQL> select username from dba_users where user_id=120;
USERNAME
------------------------------
TUX_INTRFC
SQL> select * from dba_sys_privs where GRANTEE='TUX_INTRFC';
GRANTEE PRIVILEGE ADMIN_
--------------- ------------------------- ------
TUX_INTRFC SELECT ANY SEQUENCE NO
TUX_INTRFC ALTER ANY SEQUENCE NO
TUX_INTRFC ALTER SESSION NO
TUX_INTRFC CREATE SEQUENCE NO
TUX_INTRFC CREATE ANY INDEX NO
TUX_INTRFC SELECT ANY TABLE NO
TUX_INTRFC ALTER ANY TABLE NO
TUX_INTRFC CREATE ANY TABLE NO
TUX_INTRFC SELECT ANY DICTIONARY NO
TUX_INTRFC EXECUTE ANY INDEXTYPE NO
TUX_INTRFC ALTER ANY INDEX NO
TUX_INTRFC LOCK ANY TABLE NO
TUX_INTRFC CREATE ANY SEQUENCE NO
TUX_INTRFC DELETE ANY TABLE NO
TUX_INTRFC EXECUTE ANY PROCEDURE NO
TUX_INTRFC UPDATE ANY TABLE NO
TUX_INTRFC INSERT ANY TABLE NO
TUX_INTRFC UNLIMITED TABLESPACE NO
SQL> select * from dba_role_privs where GRANTEE='TUX_INTRFC';
GRANTEE GRANTED_ROLE ADMIN_ DEFAUL
--------------- ------------------------------------------------------------ ------ ------
TUX_INTRFC RESOURCE NO YES
TUX_INTRFC CONNECT NO YES
SQL> select * from DBA_TAB_PRIVS where GRANTEE='TUX_INTRFC';
no rows selected
这里可以看到TUX_INTRFC用户只具备系统权限,而不具备对象权限。这就会导致文档1639293.1中描述的情况产生。
解决办法
首先查出具体问题时间段影响最多的用户。这里可以看到120用户是影响最多的。
SQL> select user_id,count(1) from dba_hist_active_sess_history where SNAP_ID >46342 and SNAP_ID<=46343
and event='latch: row cache objects' group by user_id order by 2;
USER_ID COUNT(1)
---------- ----------
53 1
149 1
71 2
122 3
0 4
72 5
74 5
46 5
70 6
77 6
75 6
76 7
73 10
32 15
64 107
58 122
62 124
65 126
59 127
60 129
66 130
67 130
55 131
57 135
69 136
63 136
56 140
54 140
68 146
61 151
103 217
123 1573
120 24998
出问题时间段120用户访问表对象。
SQL> select owner,object_name from
2 (
3 select distinct CURRENT_OBJ#
4 from dba_hist_active_sess_history where SNAP_ID >46342 and SNAP_ID<=46343
5 and event='latch: row cache objects' and user_id=120 and CURRENT_OBJ# not in (1,0)
6 ) b ,(select owner,object_name,object_type,object_id from dba_objects) a
7 where a.object_id = b.CURRENT_OBJ#
8 and object_type='TABLE';
OWNER OBJECT_NAME
------------------------------ ------------------------------
FFPDUSER CP_KZH_MEMBER_INFO
FFPDUSER CP_KZH_FAMILY_INFO
FFPDUSER C_IDS_ID_CHGCON_TBL
FFPDUSER C_PRNT_CNTRCT_SUSPEND_CHNG_END
FFPDUSER C_IDS_ID_APP_POS_TBL
FFPDUSER C_IDS_PRE_CDELREC_TBL
FFPDUSER CP_GPS_RQSTING
FFPDUSER C_PRNT_SPEC_TERM_CHNG
FFPDUSER CP_CNTRCT_DETAIL
FFPDUSER CP_INTRFC_BIZUNIT_AUTH
FFPDUSER C_INTRFC_PREM_RENEW_TBL
FFPDUSER CT_QUA_EMPNO
FFPDUSER CP_FINANCIAL_DTL
FFPDUSER CP_PARTICIPATION
FFPDUSER CP_PAYROLL
FFPDUSER C_PRNT_ACCNT_STATEMENT
FFPDUSER CP_HISTORICAL_NOTE
FFPDUSER CP_CLAUSE
FFPDUSER C_IDS_ADDR_TBL
FFPDUSER C_IDS_SPREREC_TBL
FFPDUSER C_IDS_RISKCON_TBL
FFPDUSER CP_LIFE_SURVEY
FFPDCMS CP_ELECTRONIC_ADDR
FFPDUSER AGT_CREDIT_GRADE
FFPDUSER C_IDS_RISKCON_RENEW_TBL
FFPDUSER C_IDS_ACCTLIST_TBL
FFPDUSER C_IDS_PHONE_TBL
FFPDUSER C_IDS_PERSON_TBL
FFPDCPICDATA IDL_WL_PYEAR_PAY2010
FFPDUSER CP_PRSN_WEB
FFPDUSER C_APP_NB_TBL
FFPDUSER CP_ACCEPT_INFO
FFPDUSER C_PRNT_CNTRCT_REPAY_END
FFPDUSER C_PRNT_PRSN_MSG_CHNG_END
FFPDCPICDATA IDL_WL_PYEAR_PAY2009
FFPDCPICDATA IDL_WL_RISKBONUS_ENG
FFPDMASTER CD_CVRG_BENEFIT
FFPDMASTER CD_CLAUSE_LINK
FFPDMASTER CD_VARIABLE
FFPDUSER CP_CLOB_NO_DATE
FFPDUSER CP_CLOB_FULL_NO_PRPGT
FFPDUSER C_IDS_PARTY_BRANCH_RLSHIP_TBL
FFPDUSER CP_KZH_TJ_INQUIRY
FFPDUSER CP_KZH_TJ_PRODUCT
FFPDUSER C_INTRFC_WX_SMS_TBL
FFPDCMS CP_ORG
FFPDUSER CO_DISTRIBUTOR
FFPDUSER CP_CLAUSE_LINK
FFPDUSER CP_CVRG_BENEFIT_PAYOUT_DTL
FFPDUSER CP_DVDND_PAYOUT_DTL
FFPDUSER CP_LNK
FFPDUSER CP_LOAN
FFPDUSER CP_SECURITY_KEY_DISTRIBUTOR
FFPDUSER CP_STTLMNT_CVRG
FFPDUSER C_IDS_APP_NB_TBL
FFPDUSER C_IDS_MONEYSCH_TBL
FFPDUSER C_IDS_POLICYST_TBL
FFPDUSER C_IDS_TMPPAY_TBL
使用下列语法赋予相关对象权限
grant select on FFPDUSER.CP_KZH_MEMBER_INFO to TUX_INTRFC;
grant insert on FFPDUSER.CP_KZH_MEMBER_INFO to TUX_INTRFC;
grant update on FFPDUSER.CP_KZH_MEMBER_INFO to TUX_INTRFC;
grant delete on FFPDUSER.CP_KZH_MEMBER_INFO to TUX_INTRFC;