shutdown数据库前的检查工作

当我们需要关闭oracle数据库时,可以使用shutdown命令,shutdown有四个参数:abort,immediate,transactional,normal。缺省不带任何参数的shutdown命令表示shutdown normal。

 

有关四个参数的区别,可参见下表:



 

一般情况下,最常用的是使用shutdown immediate来关闭数据库。以下是正常关闭数据库时的日志信息:

Fri Jul 01 21:57:54 2011
Shutting down instance: further logons disabled
Fri Jul 01 21:57:55 2011
Stopping background process QMNC
Fri Jul 01 21:57:56 2011
Stopping background process CJQ0
Fri Jul 01 21:57:57 2011
Stopping background process MMNL
Fri Jul 01 21:57:58 2011
Stopping background process MMON
Fri Jul 01 21:57:59 2011
Shutting down instance (immediate)
License high water mark = 5
Fri Jul 01 21:57:59 2011
Stopping Job queue slave processes, flags = 7
Fri Jul 01 21:57:59 2011
Job queue slave processes stopped
Fri Jul 01 21:57:59 2011
ALTER DATABASE CLOSENORMAL
Fri Jul 01 21:57:59 2011
SMON: disabling tx recovery
SMON: disabling cache recovery
Fri Jul 01 21:58:00 2011
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Thread 1 closed at log sequence 61
Successful close of redo thread 1
Fri Jul 01 21:58:00 2011
Completed: ALTER DATABASE CLOSENORMAL
Fri Jul 01 21:58:00 2011
ALTER DATABASE DISMOUNT
Completed: ALTER DATABASE DISMOUNT
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
 

这里可以看到,在database从open到mount状态时,数据库需要先disable tx recovery(transaction recovery),再disable cache recovery(row cache recovery)。有时,在shutdown数据库时,alertlog会提示Waiting for smon to disable tx recovery,这就表示oracle正在等待SMON进程执行disable tx recovery步骤。

有2种可能会导致上述情况发生:

1.      在shutdown数据库时有耗时较长的查询操作正在执行。
2.      在shutdown数据库时有大的事务操作正在执行。

这2种情况会导致数据文件关闭时无法保持一致性,所以数据库只好等待查询结束或者大事务被回滚。在这种情况下,数据库看上去可能会像被hang住一样。

所以在执行shutdown immediate之前,需要进行检查:

1.      查看是否有大的查询正在执行:
select count(*) from v$session_longops where time_remaining>0;

2.      查看是否有大的事务正在执行:
select sum(used_ublk) from v$transaction;

如果数据库在shutdown immediate时被hang住:

1.如果当时正在执行大的查询,可以使用shutdown abort来关闭数据库,之后再执行startup restrict; shutdown immediate;

2.如果当时正在执行大的事务,这个事务只要未被提交或回滚,数据库是无法在consistent state下关闭的。换句话说,如果数据库中有大的事务正在执行,shutdown immediate会使数据库hang住,事务回滚。如果用shutdown abort则会使情况变得更糟,因为如果事务未完成,下次数据库启动时,会有large tx recovery,就是大事务的回滚,回滚的时间取决于事务的大小

另外使用shutdown abort后启动的数据库,在v$transaction中将查不到有关事务的信息。不过可以使用以下语句查询:

select * from v$fast_start_transactions;

select * from v$fast_start_servers;

附相关参考信息:

What To Do and Not To Do When 'shutdown immediate' Hangs [ID 375935.1]
 In this Document
  Goal
  Solution
  References

--------------------------------------------------------------------------------

Applies to:
Oracle Server - Enterprise Edition - Version:8.1.7.0 to 11.1.0.7 - Release: 8.1.7 to 11.1
Information in this document applies to any platform.
***Checked for relevance on 22-Jul-2010***
Goal
What to do when shutdown immediate appears to hang:
Sometimes, the message'Waiting for smon to disable tx recovery'is posted in the alert log.
This note only addresses situations when the apparent hang occurs when the database is going from OPEN to MOUNT, which is actually the most common situation.
If the apparent hang occurs at a different step, then this note does not apply.
Solution
The big problem in these situations is that it is noticed only after the shutdown immediate has been issued.
This kind of situation is mostly caused by 2 things:
1. a large query running at the shutdown moment.
2. a large transaction running at the shutdown moment.
 

Both have to complete in order for the database to be brought down when shutdown immediate is issued.
Actually, the files cannot be closed consistently because of one of the 2 possibilities above and, as such, the transition from OPEN to MOUNT is postponed until the files are closed, which means that either the large query completes or the large transaction is rolled back. This is not a hang, it is the expected behavior.
 

So, before issuing the shutdown immediate, it would be recommended to check the following views, especially when the database needs to be brought down for a very short period of time:
1. for large queries:
select count(*) from v$session_longops where time_remaining>0;
2. for large transactions:
select sum(used_ublk) from v$transaction;
A result greater than 0 for the first query and a large value returned for the second one would mean a relatively long time to wait until the shutdown immediate completes.
For the second situation, please also check step9 inNote117316.1to "guestimate" the time to rollback the transactions.

1.    For the large queries situation, when the shutdown immediate is hanging, you can just bring down the database using:shutdown abort, as the database could be easily brought to a consistent state by:
startup restrictfollowed byshutdown immediate.
One should take the backup and/or do whatever else need to be done after theshutdown immediate.
2.    For the second situation, the workaround cannot be applied, especially when it's needed to take a cold backup. The database must be closed in a consistent state in order to do this and the consistent state cannot be achieved until all the transactions have completed one way or another (commit/rollback).
As such, it's up to the local personnel to decide what to do, depending on the local needs.
It is very important to realize that: BY SHUTTING DOWN A DATABASE YOU DO NOT SOLVE A PERFORMANCE PROBLEM CAUSED BY A LARGE TRANSACTION. You are only making things worse.
There are situations when the database is brought down even when a large transaction/large recovery is taking place. Then it's brought up again and a new shutodwn is tried. Again, the shutdown immediate is hanging, for a very simple reason - the large recovery is still going on.
At this moment, the v$transaction view is not displaying anything.
However, it is still possible to check the recovery operation by checking the:
select * from v$fast_start_transactions;
and/or
select * from v$fast_start_servers;
views. They are the ones that display the recovery status.
As such, when a large transaction is taking place, do not try successive shutdown aborts, startups and shutdown immediate. The hang will reoccur. The database must be consistent when the database is dismounted - performing successive shutdowns/startups is not helping at all, it's only making the recovery even more lengthy.
 
You should prevent these situations by notifying the users a shutdown will be done and no large operations should be started.
If a large operation has already started at the moment when you want to shutdown immedate, assess what would be faster - rollback the current situation or allow it to complete.

step9 inNote 117316.1

9. To check for rollback:

select used_ublk from v$transaction where ADDR=<value from TADDR in v$session>;
If there is a value there, this is the number of undo blocks used by the transaction. Wait one minute and again select "used_ublk" from "v$transaction" where ADDR=<value from TADDR in v$session>; .
Note the value. If it is decreasing, a rollback is occuring and based on the difference between these values, you can "guesstimate" the time required to complete the rollback. For example, if the first query returns a value of 80000 and the second one returns 70000, it took 1 minute to rollback 10000 blocks. Based on this number, you can guestimate the time to complete the rollback. In this case, it would be 7 minutes.

 

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
SHUTDOWN NORMAL or SHUTDOWN IMMEDIATE hangs

看看alertlog中有没有这样的记录

        Shutting down instance (immediate)
        License high water mark = 12
        Thu Dec  8 18:43:16 1994
        alter database  close normal
        Thu Dec  8 18:43:17 1994
        SMON: disabling tx recovery
        SMON: disabling cache recovery or
    waiting for smon to disable tx recovery --这表明smon在清理一些事务信息

通常在shutdown immediate和shutdown normal时要清理extents信息,比如你进行了一个大排序,只有在重起数据库时才释放使用的空间(但文件还是那么大,不会自动变小),所以就要花很多时间来进行清理,另外如果执行了一个大事务时发布shutdown immedaite(这时数据库就会想所有影子进程发送一个消息,要他们完成回滚或者清理工作),所以就需要很长时间的回滚工作。

对于大的排序,你在shutdown immedaite在等待smon处理时还没有正在关掉可以查看
select count(block#) from fet$;
select count(block#) from uet$;
数字是否在变小,如果变小,说明是在清理工作

 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~``

select * from v$fast_start_transactions;

直到状态由RECOVERING变成RECOVERED