Oracle shutdown immediate hang住 现象分析（转至dave的blog）
前几天群里的一个朋友说他的数据库shutdown immediate后不能关闭。 hang住了。 这种情况以前也遇到过一次，是测试库。 就直接shutdown abort了。 当然生产库不建议这么做。
在Oracle metalink上搜了一些有关执行shutdown immedate 命令hang住或者slow的文章。 整理如下。
一. BUG 导致shutdown immediate Hang住
1.1 Bug 6512622 SHUTDOWN IMMEDIATE hangs / OERI
This note gives a brief overview bug 6512622.
The content was last updated on: 18-NOV-2010
Click here for details of each of the sections below.
Oracle Server (Rdbms)
Range of versions believed to be affected
Versions BELOW 11.1
Versions confirmed as being affected
Generic (all / most platforms affected)
This issue is fixed in
1.2 Bug 5057695: Shutdown Immediate Very Slow To Close Database [ID 428688.1]
Apply one off Patch 5057695 available on My Oracle Support on top of 188.8.131.52, 10.2.0.2 or 10.2.0.3.
The fix is included as of the 10.2.0.4 patchset.
二. Shutdown immediate hang住的一般原因
Shutdown Immediate Hangs: Common Causes [ID 106474.1]
（2）Temp segment cleanup
Rollbacks: If a large dml statement is in progress it must roll back before the shutdown completes. In pre 8.1 versions of Oracle this can be made to go faster with a higher cleanup_rollback_entries, but be careful because too high a setting can cause resource contention during normal database operation.
Temp segment cleanup: If your tablespace used for temp sort segments is of type temporary (as shown by the CONTENTS column of DBA_TABLESPACES) the extents, once allocated, are never deallocated until shutdown. Instead they are tracked in the SGA. This helps reduce dictionary cache contention during normal database operation. Unfortunately, when we shutdown these extents must all be deallocated at once which can lead to the dictionary cache contention we avoided while we were running. The tablespace of type temporary was introduced in 7.3.
These two problems can also combine to make things even slower as a rollback will also deallocate extents, leading to further contention on the parts of the dictionary cache that track used and free extents.
How to Check Why Shutdown Immediate Hangs? [ID 164504.1]
A. While shutdown immediate is hanging
Start Server Manager (or SQL*Plus for 8i or higher)
SVRMGRL> connect internal (or SYSDBA for 8i or higher)
SVRMGRL> select * from x$ktuxe where ktuxecfl = 'DEAD';
This shows dead transactions that smon is looking to rollback.
B. Plan to shutdown again and gather some information. Before issuing the
shutdown immediate command set some events as follows:
SVRMGRL> connect internal
SVRMGRL> alter session set events '10046 trace name context forever,level 12';
SVRMGRL> alter session set events '10400 trace name context forever, level 1';
SVRMGRL> shutdown immediate;
10046 turns on extended SQL_TRACE for the shutdown process.
10400 dumps a systemstate every 5 minutes.
在Oracle 10g 和 11g的官方文档都没有搜到SVRMGRL命令的相关说明。 这个应该是8i版本里的命令。
Oracle sqlplus prelim 参数介绍
What To Do and Not To Do When 'shutdown immediate' Hangs [ID 375935.1]
The big problem in these situations is that it is noticed only after the shutdown immediate has been issued.
This kind of situation is mostly caused by 2 things:
1. a large query running at the shutdown moment.
2. a large transaction running at the shutdown moment.
Both have to complete in order for the database to be brought down when shutdown immediate is issued.
Actually, the files cannot be closed consistently because of one of the 2 possibilities above and, as such, the transition from OPEN to MOUNT is postponed until the files are closed, which means that either the large query completes or the large transaction is rolled back. This is not a hang, it is the expected behavior.
-- 这段讲的很清楚。 DB dismount时需要做的2个操作。
So, before issuing the shutdown immediate, it would be recommended to check the following views, especially when the database needs to be brought down for a very short period of time:
1. for large queries:
select count(*) from v$session_longops where time_remaining>0;
2. for large transactions:
select sum(used_ublk) from v$transaction;
A result greater than 0 for the first query and a large value returned for the second one would mean a relatively long time to wait until the shutdown immediate completes.
1. For the large queries situation, when the shutdown immediate is hanging, you can just bring down the database using: shutdown abort, as the database could be easily brought to a consistent state by:
startup restrict followed by shutdown immediate.
One should take the backup and/or do whatever else need to be done after the shutdown immediate.
2. For the second situation, the workaround cannot be applied, especially when it's needed to take a cold backup. The database must be closed in a consistent state in order to do this and the consistent state cannot be achieved until all the transactions have completed one way or another (commit/rollback).
As such, it's up to the local personnel to decide what to do, depending on the local needs.
It is very important to realize that: by shutting down a database you do not solve a performance problem caused by a large transaction. You are only making things worse.
There are situations when the database is brought down even when a large transaction/large recovery is taking place. Then it's brought up again and a new shutodwn is tried. Again, the shutdown immediate is hanging, for a very simple reason - the large recovery is still going on.
At this moment, the v$transaction view is not displaying anything. However, it is still possible to check the recovery operation by checking the:
select * from v$fast_start_transactions;
select * from v$fast_start_servers;
views. They are the ones that display the recovery status.
As such, when a large transaction is taking place, do not try successive shutdown aborts, startups and shutdown immediate. The hang will reoccur. The database must be consistent when the database is dismounted - performing successive shutdowns/startups is not helping at all, it's only making the recovery even more lengthy.
You should prevent these situations by notifying the users a shutdown will be done and no large operations should be started.
If a large operation has already started at the moment when you want to shutdown immedate, assess what would be faster - rollback the current situation or allow it to complete.
Shutdown Normal or Shutdown Immediate Hangs. SMON disabling TX Recovery [ID 1076161.6]
During a SHUTDOWN IMMEDIATE and SHUTDOWN NORMAL, SMON is cleaning up extents which are no longer needed and marking them as freed.
Either wait for SMON to clean up the free extents in the database as it shuts down or perform a SHUTDOWN ABORT to shutdown the instance. A SHUTDOWN ABORT will not perform a clean shutdown.
3. shutdown immediate
如果这样还是无法关闭的话，可以尝试手工提交（commit）和checkpoint（alter system checkpoint），如果还没有办法，就kill session 和 监听的进程（LOCAL=NO）。
Oracle 服务器 进程中的 LOCAL=NO 和 LOCAL=YES
在这篇文章里做了一点说明， 就是在服务器上连接数据库，它是不走监听的。 就是LOCAL=YES的进程。 对于非本地的连接，都是走监听。 在服务器上显示的是LOCAL=NO。
注意：对于已经建立的非本地监听连接，即LOCAL=NO。 关闭监听对它们是没有影响的。 相关进程仍然存在，通过它们与数据库连接的session仍然可以继续执行事务操作。