很早之前就遇到过几次这样的问题,今天又碰到了,转载一篇别人的blog记录一下。
数据库版本
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for HPUX: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
问题如下
SQL> conn doudou/oracle (HANG住了)
查看等待事件
select count(*),
CASE
WHEN state != 'WAITING' THEN
'WORKING'
ELSE
'WAITING'
END AS state,
CASE
WHEN state != 'WAITING' THEN
'On CPU / runqueue'
ELSE
event
END AS sw_event
FROM v$session_wait
GROUP BY CASE
WHEN state != 'WAITING' THEN
'WORKING'
ELSE
'WAITING'
END,
CASE
WHEN state != 'WAITING' THEN
'On CPU / runqueue'
ELSE
event
END
ORDER BY 1 DESC, 2 DESC
library cache lock WAITING 585
rdbms ipc message WAITING 16
Space Manager: slave idle wait WAITING 3
jobq slave wait WAITING 2
Streams AQ: waiting for time management or cleanup tasks WAITING 1
VKRM Idle WAITING 1
smon timer WAITING 1
Streams AQ: qmn coordinator idle wait WAITING 1
pmon timer WAITING 1
Streams AQ: qmn slave idle wait WAITING 1
DIAG idle wait WAITED KNOWN TIME 1
DIAG idle wait WAITING 1
library cache lock WAITED KNOWN TIME 1
VKTM Logical Idle Wait WAITING 1
asynch descriptor resize WAITED SHORT TIME 1
SQL*Net message from client WAITING 1
结合等待事件去分析
1.library cache lock 等待严重,另一方面考虑只有单独的这个业务用户doudou不能登录,其他业务类型的用户doudou01不受任何影响。再次怀疑可能是11g 密码延迟机制导致的这个问题。
2.然后查看了一下用户修改密码的时间
select * from sys.user$ where name='DOUDOU';
PTIME=2013/11/6 11:22:09 --PTIME is the date the password was last changed
CTIME=2013/11/6 11:22:09 --CTIME is the date the user was created
从这里可以看出我们DOUDOU用户,没有修改过密码,但是为什么会出现大量的library cache lock,没有修改密码,但是新业务配置的用户密码会不会有错误呢,这样询问了开发人员,原来他们的配置有错误,用户密码配置错误了。也就是错误的用户和密码批量请求导致了大量的library cache lock。
MOS类似的案例
Applies to:
Oracle Database - Enterprise Edition - Version 10.2.0.5 and later
Information in this document applies to any platform.
***Checked for relevance on 07-Jul-2013***
Symptoms
'Library cache lock' or 'row cache lock' can be observed when concurrent users login with wrong password to the database.
The 'row cache lock' is seen in 10.2 and 11.1 while the 'library cache lock' is seen in 11.2.
Stack contains one of the following functions:
kziavua
kziaia
kziasfc
Checking the exclusive holder from DBA_DDL_LOCKS, there is always a session holding a lock type (kglhdnsp) 79 on object (kglnaobj) 5:
SQL> select * from dba_ddl_locks where mode_held='Exclusive';
SESSION_ID OWNER NAME TYPE MODE_HELD MODE_REQU
---------- --------- ---------- ---------- --------- ---------
612 5 79 Exclusive None
Changes
Concurrent users with wrong password try to login to the database.
Cause
1. Numerous failed logins attempts can cause row cache lock waits and/or library cache lock waits.
2. This was reported in Bug 11742803 Lots of 'Library cache lock' During User Logon Autentication.
For 11107, it was solved by applying following fix for unpublished Bug 7715339 Waits on Row Cache Lock on DC_USERS After Upgrade from 10gR2 to 11107 and is documented in the following:
Document:7715339.8 Logon failures causes "row cache lock" waits - Allow disable of logon delay
3. For 11107 and 11.2.0.1 as part of fixing bug 9720182, the management of passwords moved from the dc_users row cache to a new library cache.
Bug 9720182:
Note:9720182.8 Bug 9776608 Hang from concurrent login to same account with a wrong password
@Solution
1. Check for bad or incorrect password or login attack by running following sql:
select username,
os_username,
userhost,
client_id,
trunc(timestamp),
count(*) failed_logins
from dba_audit_trail
where returncode=1017 and --1017 is invalid username/password
timestamp < sysdate -7
group by username,os_username,userhost, client_id,trunc(timestamp);
Or run following sql:
SELECT "USERNAME", "OS_USERNAME", "USERHOST", "EXTENDED_TIMESTAMP",returncode FROM "SYS"."DBA_AUDIT_SESSION" WHERE returncode != 0;
2. Set the below event in the spfile or init.ora file and restart the database:
alter system set event ="28401 TRACE NAME CONTEXT FOREVER, LEVEL 1" scope=spfile;
or
EVENT="28401 TRACE NAME CONTEXT FOREVER, LEVEL 1"
Note:
Care should be taken when setting this event, as this is disabling the sleep time which can leave the system more vulnerable to brute force password guessing attacks.
For full information on this issue please read the following note before deciding to set the event:
Document:7715339.8 Logon failures causes "row cache lock" waits - Allow disable of logon delay
In 11.1.0.7, if One off fix Patch 7715339 is applied, the delay is disabled unconditionally, and there is no need to set the event.
References
NOTE:7715339.8 - Bug 7715339 - Logon failures causes "row cache lock" waits - Allow disable of logon delay
BUG:11742803 - LOTS OF 'LIBRARY CACHE LOCK' DURING USER LOGON AUTHENTICATION
3.问题解决,正确的用户密码配置之后,并设置参数EVENT="28401 TRACE NAME CONTEXT FOREVER, LEVEL 1",大量的library cache lock逐渐减少,最后消除。新业务也正常使用了
附表
user$ 视图解释
Test cases below show:
CTIME is the date the user was created.
LTIME is the date the user was last locked. (Note that it doesn't get NULLed when you unlock the user). --通过这个数是否增长判断是否还有用户尝试登陆数据库失败
PTIME is the date the password was last changed.
LCOUNT is the number of failed logins.
记录用户登录失败触发器:
CREATE OR REPLACE TRIGGER logon_denied_to_alert
AFTER servererror ON DATABASE
DECLARE
message VARCHAR2(168);
ip VARCHAR2(15);
v_os_user VARCHAR2(80);
v_module VARCHAR2(50);
v_action VARCHAR2(50);
v_pid VARCHAR2(10);
v_sid NUMBER;
v_program VARCHAR2(48);
BEGIN
IF (ora_is_servererror(1017)) THEN
-- get ip FOR remote connections :
IF upper(sys_context('userenv', 'network_protocol')) = 'TCP' THEN
ip := sys_context('userenv', 'ip_address');
END IF;
SELECT sid INTO v_sid FROM sys.v_$mystat WHERE rownum < 2;
SELECT p.spid, v.program
INTO v_pid, v_program
FROM v$process p, v$session v
WHERE p.addr = v.paddr
AND v.sid = v_sid;
v_os_user := sys_context('userenv', 'os_user');
dbms_application_info.read_module(v_module, v_action);
message := to_char(SYSDATE, 'YYYYMMDD HH24MISS') ||
' logon denied from ' || nvl(ip, 'localhost') || ' ' ||
v_pid || ' ' || v_os_user || ' with ' || v_program || ' – ' ||
v_module || ' ' || v_action;
sys.dbms_system.ksdwrt(2, message);
END IF;
END;
/