Troubleshooting 'library cache: mutex X' waits. [ID 1357946.1]

最新推荐文章于 2021-05-21 10:05:33 发布

cuhongyang4157

最新推荐文章于 2021-05-21 10:05:33 发布

阅读量174

点赞数

文章标签：数据库

修改时间 22-SEP-2011 类型 TROUBLESHOOTING 状态 PUBLISHED

In this Document
  Purpose
  Last Review Date
  Instructions for the Reader
  Troubleshooting Details
     What is a 'library cache: mutex X' wait?
     What causes 'library cache: mutex X' wait?
     How to Examine the Diagnostics.
     Potential Solutions
  References

Applies to:

Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.2.0.2 - Release: 11.1 to 11.2
Information in this document applies to any platform.

Purpose

The purpose of the note is to help customers troubleshoot library cache: mutex X.

Last Review Date

September 12, 2011

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

Troubleshooting Details

What is a 'library cache: mutex X' wait?

Library cache: mutex X is similar to library cache wait in earlier version. Library cache: mutex X may be caused by many issues. So it is important to find the cause; so the right solution can be implemented.

What causes 'library cache: mutex X' wait?

* Frequent Hard Parses
If the frequency of Hard Parsing is extremely high, then contention can occur on this pin.
* High Version Counts
When Version counts become excessive, a long chain of versions needs to be examined and this can lead to contention on this event
* Invalidations and reloads
*Known Bugs

How to diagnose the cause.1. Check to see if anything has changed:

     a. increased load
     b. any change in the application, os, or middle tier
     c. any os change

2. Is there a trend to library cache: mutex X:

    a. is there a certain time of the day when this wait is seen?
    b. does something trigger this wait?

3. During the time of the issue, run AWR and ADDM. Also obtain the baseline to compare the load, parameter changes, and any other differences.

To gather this it is suggested to run AWR and ADDM for half an hour to an hour interval as follows:

SQL>@$ORACLE_HOME/rdbms/admin/awrrpt.sql

SQL>@$ORACLE_HOME/rdbms/admin/addmrpt.sql

4. Sometimes system state dump is necessary to match known issues. For example, if there is no obvious candidate SQL in AWR, capturing holder or waiter processes in systemstate allows you to focus in on potential problems. Run system state when processes appear hung on 'library cache: mutex X':

(a) Non-Rac
sqlplus "/ as sysdba"

oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
wait 90 seconds
oradebug dump systemstate 266
wait 90 seconds
oradebug dump systemstate 266
quit

(b) RAC
$ sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug setinst all
oradebug -g all hanganalyze 4
oradebug -g all dump systemstate 266
quit

5. Errorstacks: Another way to obtain process information is with errorstack. Assuming you can identify a blocker, taking errorstacks will provide much the same information as systemstates but with a much reduced disk footprint for trace. Once the ospid of the blocker has been found, an errorstack can be generated:

$ sqlplus
SQL> oradebug setospid
oradebug dump errorstack 3
<< wait 1min>>
oradebug dump errorstack 3
<< wait 1min>>
oradebug dump errorstack 3
exit

In particular, the stack from the resultant trace can be used to match known issues.

The system state and errorstacks are not easily readable; so a Service Request may need to be opened to read the files.

6. Sometimes it is not feasible to run system state dump, as it may be resource intensive. So the following sql can also be ran in interval:

select s.sid, t.sql_text
from v$session s, v$sql t
where s.event like '%mutex%'
and t.sql_id = s.sql_id

Check to see what sessions are waiting on.

7. In 11g RAC, there is another less resource intensive tool that can be used when compared with taking system state dumps:

Note:459694.1 Procwatcher: Script. to Monitor and Examine Oracle DB and Clusterware Processes

How to Examine the Diagnostics.

1. Look for high parsing and high version counts from AWR.
Click on *SQL Statistics under Main Report of AWR:

Main Report

* Report Summary
* Wait Events Statistics
* SQL Statistics
* Instance Activity Statistics
* IO Stats
* Buffer Pool Statistics
* Advisory Statistics
* Wait Statistics
* Undo Statistics
* Latch Statistics
* Segment Statistics
* Dictionary Cache Statistics
* Library Cache Statistics
* Memory Statistics
* Streams Statistics
* Resource Limit Statistics
* init.ora Parameters

Then, under SQL Statistics click on 'SQL ordered by Version Count' and 'SQL ordered by Version Count' to view that information:

SQL Statistics

* SQL ordered by Elapsed Time
* SQL ordered by CPU Time
* SQL ordered by Gets
* SQL ordered by Reads
* SQL ordered by Executions
* SQL ordered by Parse Calls
* SQL ordered by Sharable Memory
* SQL ordered by Version Count
* SQL ordered by Cluster Wait Time
* Complete List of SQL Text

Check for high parse calls.

SQL ordered by Parse Calls

* Total Parse Calls: 2,935,390
* Captured SQL account for 95.6% of Total

Parse Calls Executions % Total Parses SQL Id SQL Module SQL Text
668,174 668,014 22.76 gpkdazvcfwkv select DATA_TABLE (...
667,440 668,234 22.74 5p9vjzht9jqb INSERT INTO DATA_TABLE (DATA_I...

Check to see if there is high parse calls to execute. Ideally, there should be less parse to executions.

SQL ordered by Version Count

* Only Statements with Version Count greater than 20 are displayed

Version Count Executions SQL Id SQL Module SQL Text
277 392,737 30d5a6v16mpb select FLOW_CONTEXT (...
78 131,104 7c0gj35488xs INSERT INTO PROJECT (DOC_ID, ...

From this list, investigate the SQLs with the high version count. What are the reasons that these statements are not shared? Can this be addressed?

Potential Solutions

1. Check for high hard parsing, as this can cause can reloads in the sql area. Check the hard parse under the load profile:

Load Profile

Per Second Per Transaction
Redo size: 1,923,746.18 18,078.39
Logical reads: 585,678.81 5,503.91
Block changes: 7,116.40 66.88
Physical reads: 602.31 5.66
Physical writes: 282.49 2.65
User calls: 1,160.80 10.91
Parses: 504.43 4.74
Hard parses: 0.61 0.01
Sorts: 25.34 0.24
Logons: 0.06 0.00
Executes: 505.51 4.75
Transactions: 106.41

Also check for high reloads in the sql area:

Library Cache Activity

* "Pct Misses" should be very low

Namespace Get Requests Pct Miss Pin Requests Pct Miss Reloads Invali- dations
BODY 105 3.81 3,297 0.49 12 0
CLUSTER 105 2.86 209 2.87 3 0
INDEX 166 19.28 400 12.75 19 0
SQL AREA 2,831 85.84 4,568,649 0.22 1,956 533
TABLE/PROCEDURE 3,075 12.94 1,831,653 0.15 1,147 0

If there is high reload, then check to see if the shared pool or sga_target is large enough.

If the shared pool is too small, the sqls will age out and hard parses will be higher.

The following note is helpful in tuning the shared pool:

Note:62143.1 Understanding and Tuning the Shared Pool

Also check for application that are not sharing sqls or using literals. In this case, consider using bind variables in place of literals.

2. Check for invalidations under Library Cache Activity. If the invalidation has high number, then check ddl's performed during the time such as truncate, drop, grants, dbms_stats, etc.

3. High version counts can also cause 'cursor: pin S wait on X' waits.

Check V$SQL_SHARED_CURSOR to see the potential reason for the high version count using:

Note:438755.1 Formated V$SQL_SHARED_CURSOR Report by SQLID or Hash Value
Note:296377.1 Troubleshooting: High Version Count Issues (Doc ID 296377.1)

4. Check the following note for relevant bugs under 'Known Bugs' and relevant version:

Note:727400.1WAITEVENT: "library cache: mutex X"

5. For 11g, make sure cursor_sharing is not similar, as it has been deprecated. This may also cause mutex waits:

Note:1169017.1ANNOUNCEMENT: Deprecating the cursor_sharing = ‘SIMILAR’ setting

6. If the database has been migrated from 10g to 11g and mutex performance issue surfaces, please consider the 11.2.0.2.2 psu + fix for 12431716. Many mutex fixes are already included in this patch:

Note:1291879.1 Oracle Database Patch Set Update 11.2.0.2.2 Known Issues

References

NOTE:1291879.1 - Oracle Database Patch Set Update 11.2.0.2.2 Known Issues
NOTE:296377.1 - Troubleshooting: High Version Count Issues
NOTE:438755.1 - Formatted V$SQL_SHARED_CURSOR Report by SQLID or Hash Value
NOTE:459694.1 - Procwatcher: Script. to Monitor and Examine Oracle DB and Clusterware Processes
NOTE:62143.1 - Troubleshooting: Tuning the Shared Pool and Tuning Library Cache Latch Contention
NOTE:727400.1 - WAITEVENT: "library cache: mutex X"

显示相关信息 相关内容