从alert_SID.log看Oracle数据库启动三阶段

最新推荐文章于 2022-03-19 22:09:10 发布

xiaozhongma

最新推荐文章于 2022-03-19 22:09:10 发布

阅读量1.9k

点赞数

分类专栏：数据库文章标签： oracle 数据库 instance

数据库专栏收录该内容

3 篇文章 0 订阅

订阅专栏

Oracle数据库启动是分为三个不同的阶段的，分别为nomount、mount和open。在不同的阶段，Oracle进行不同的行为，执行不同的操作。在笔者《Oracle启动三阶段》（http://space.itpub.net/17203031/viewspace-684503）系列中，已经进行一定程度的说明。本篇打算从Alert_log，也就是数据库自身的日志中，分析启动三阶段的不同行为和操作。

alert_SID.log是Oracle数据库运行阶段的运行日志。在Oracle工作时，会将所有会影响到整体性能的操作、变化和错误信息记录在一个日志中，就是Alert_log。在Oracle11g中，提供了文本文件和XML格式文件两种日志格式。对我们来说，alert_SID.log是进行数据库故障诊断、监控和调整的重要研究对象。

一、环境准备

采用Oracle11gR2 For Linux版本进行研究，具体过程如下。

Last login: Sun Apr 3 10:51:03 2011

[root@oracle11g~]# ps -ef | grep ora //当前后台没有与oracle相关的进程；

root 5514 5481 0 08:28 pts/0 00:00:00 grep ora

[root@oracle11g~]#

[root@oracle11g~]# su - oracle

[oracle@oracle11g~]$ env | grep ORACLE//启动所必需使用的环境变量；

ORACLE_SID=wilson

ORACLE_BASE=/u01

ORACLE_HOME=/u01/oracle

[oracle@oracle11g~]$

二、启动阶段操作

分阶段进行启动操作，之后我们查看效果。

1、nomount阶段

[oracle@oracle11g~]$ sqlplus /nolog

SQL*Plus: Release11.2.0.1.0 Production on Tue Apr 5 08:31:25 2011

SQL> conn / as sysdba;

Connected to an idle instance. //当前实例未启动

SQL> startup nomount

ORACLE instance started.//实例启动，共享内容空间分配！

Total System Global Area 414298112 bytes

Fixed Size 1336904 bytes

Variable Size 310380984 bytes

Database Buffers 96468992 bytes

Redo Buffers 6111232 bytes

此时，我们观察后台活动进程情况，可以看到。

[oracle@oracle11g~]$ ps -ef | grep ora

oracle 5594 1 0 08:32 ? 00:00:00 ora_pmon_wilson

oracle 5596 1 0 08:32 ? 00:00:00 ora_vktm_wilson

oracle 5600 1 0 08:32 ? 00:00:00 ora_gen0_wilson

oracle 5602 1 0 08:32 ? 00:00:00 ora_diag_wilson

oracle 5604 1 0 08:32 ? 00:00:00 ora_dbrm_wilson

oracle 5606 1 0 08:32 ? 00:00:00 ora_psp0_wilson

oracle 5608 1 0 08:32 ? 00:00:00 ora_dia0_wilson

oracle 5610 1 2 08:32 ? 00:00:00 ora_mman_wilson

oracle 5612 1 0 08:32 ? 00:00:00 ora_dbw0_wilson

oracle 5614 1 0 08:32 ? 00:00:00 ora_lgwr_wilson

oracle 5616 1 0 08:32 ? 00:00:00 ora_ckpt_wilson

oracle 5618 1 0 08:32 ? 00:00:00 ora_smon_wilson

oracle 5620 1 0 08:32 ? 00:00:00 ora_reco_wilson

oracle 5622 1 0 08:32 ? 00:00:00 ora_mmon_wilson

oracle 5624 1 0 08:32 ? 00:00:00 ora_mmnl_wilson

oracle 5626 1 0 08:32 ? 00:00:00 ora_d000_wilson

oracle 5628 1 0 08:32 ? 00:00:00 ora_s000_wilson

oracle 5630 1 0 08:32 ? 00:00:00 ora_s001_wilson

oracle 5632 5547 1 08:32 ? 00:00:00 oraclewilson (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

在nomount状态，虽然没有涉及到数据文件读取、控制文件读取等内容。但是最重要的数据库实例已经建立。核心后台进程PMON、DBWn、LGWr、SMON和CKPT已经建立，共享内存SGA区已经分配。

2、Mount状态

进入mount状态

SQL> alter database mount;

Database altered.

3、Open状态

进入Open状态

SQL> alter database open;

Database altered.

SQL> select count(*) from scott.emp;

COUNT(*)

----------

三、日志分析

根据系统参数可以获取到alert_log的路径，文件名默认为alert_<sid>.log，其中sid为数据库的唯一标志。在实验环境中，数据库名称为alert_wilson.log。

SQL> show parameter dump_dest;

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

background_dump_dest string /u01/diag/rdbms/wilson/wilson/

trace

core_dump_dest string /u01/diag/rdbms/wilson/wilson/

cdump

user_dump_dest string /u01/diag/rdbms/wilson/wilson/

trace

[oracle@oracle11g~]$ date

Tue Apr 5 08:37:40 CST 2011

根据时间间隔进行整理日志，我们看到三个阶段数据库操作情况。

1、nomount阶段

nomount阶段进行的主要是Oracle实例的构建，共享内存的开启，以及初始化参数的读取定位。下面为日志相关部分片段。

Tue Apr 05 08:32:27 2011

Starting ORACLE instance (normal)

LICENSE_MAX_SESSION = 0

LICENSE_SESSIONS_WARNING = 0

Picked latch-free SCN scheme 2

Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST

Autotune of undo retention is turned on.

IMODE=BR

ILAT =27

LICENSE_MAX_USERS = 0

SYS auditing is disabled

Starting up:

Oracle Database11gEnterpriseEdition Release11.2.0.1.0 - Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options.

Using parameter settings in server-side spfile /u01/oracle/dbs/spfilewilson.ora

System parameters with non-default values:

processes = 150

memory_target =396M

control_files = "/u01/oradata/WILSON/controlfile/o1_mf_6bcsqm50_.ctl"

control_files = "/u01/flash_recovery_area/WILSON/controlfile/o1_mf_6bcsqmd8_.ctl"

db_block_size = 8192

compatible = "11.2.0.0.0"

db_create_file_dest = "/u01/oradata"

db_recovery_file_dest = "/u01/flash_recovery_area"

db_recovery_file_dest_size=3852M

undo_tablespace = "UNDOTBS1"

remote_login_passwordfile= "EXCLUSIVE"

db_domain = ""

dispatchers = "(PROTOCOL=TCP) (SERVICE=wilsonXDB)"

shared_servers = 2

audit_file_dest = "/u01/admin/wilson/adump"

audit_trail = "DB"

db_name = "wilson"

open_cursors = 300

diagnostic_dest = "/u01"

Tue Apr 05 08:32:30 2011

PMON started with pid=2, OS id=5594

Tue Apr 05 08:32:30 2011

VKTM started with pid=3, OS id=5596 at elevated priority

VKTM running at (10)millisec precision with DBRM quantum (100)ms

Tue Apr 05 08:32:30 2011

GEN0 started with pid=4, OS id=5600

Tue Apr 05 08:32:30 2011

DIAG started with pid=5, OS id=5602

Tue Apr 05 08:32:30 2011

DBRM started with pid=6, OS id=5604

Tue Apr 05 08:32:31 2011

PSP0 started with pid=7, OS id=5606

Tue Apr 05 08:32:31 2011

DIA0 started with pid=8, OS id=5608

Tue Apr 05 08:32:31 2011

MMAN started with pid=9, OS id=5610

Tue Apr 05 08:32:31 2011

DBW0 started with pid=10, OS id=5612

Tue Apr 05 08:32:31 2011

CKPT started with pid=12, OS id=5616

Tue Apr 05 08:32:31 2011

LGWR started with pid=11, OS id=5614

Tue Apr 05 08:32:31 2011

SMON started with pid=13, OS id=5618

Tue Apr 05 08:32:31 2011

RECO started with pid=14, OS id=5620

Tue Apr 05 08:32:31 2011

MMON started with pid=15, OS id=5622

starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...

Tue Apr 05 08:32:31 2011

MMNL started with pid=16, OS id=5624

starting up 2 shared server(s) ...

ORACLE_BASE from environment = /u01

总结：在本阶段，Oracle忙于进行如下阶段顺序操作。

根据环境变量，确定初始化参数文件位置。根据环境变量中对ORACLE_SID、ORACLE_HOME等的设置，确定要启动数据库在物理磁盘上的位置。根据Oracle的OFS结构，来确定使用参数文件可能的位置。根据spfile<sid>.ora、spfile.ora、init<sid>.ora和init.ora的顺序来定位参数文件。
读取参数文件，确定基础的参数内容。通过读取参数文件，Oracle启动过程获取到该数据库的基本信息，包括控制文件位置、块大小、各种类型目录的具体位置、特殊表空间名称等内容。其中，对控制文件的定位，是进一步读取更多数据库信息，定位数据库文件的重要前置步骤；

2、mount阶段

在mount阶段，数据库日志的内容相对简单些了。

Tue Apr 05 08:33:09 2011

alter database mount

Tue Apr 05 08:33:13 2011

Successful mount of redo thread 1, with mount id 3868018373

Database mounted in Exclusive Mode

Lost write protection disabled

Completed: alter database mount

总结：mount阶段，Oracle主要进行的各类型文件（数据文件、日志文件）的定位寻找工作。通过在nomount阶段加载的控制文件信息，确定了当前系统的文件的具体位置已经SCN信息。在此阶段，可以进行数据恢复方面的一些工作。

3、Open阶段

Tue Apr 05 08:33:29 2011

alter database open

Beginning crash recovery of 1 threads

Started redo scan

Completed redo scan

read 18880 KB redo, 1853 data blocks need recovery

Started redo application at

Thread 1: logseq 87, block 4964

Recovery of Online Redo Log: Thread 1 Group 3 Seq 87 Reading mem 0

Mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_6bcsqtfj_.log

Mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_6bcsqtwv_.log

Completed redo application of 15.28MB

Completed crash recovery at

Thread 1: logseq 87, block 42725, scn 2965918

1853 data blocks read, 1853 data blocks written, 18880 redo k-bytes read

Tue Apr 05 08:33:33 2011

Thread 1 advanced to log sequence 88 (thread open)

Thread 1 opened at log sequence 88

Current log# 1 seq# 88 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_6bcsqpty_.log

Current log# 1 seq# 88 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_6bcsqqt0_.log

Successful open of redo thread 1

MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set

Tue Apr 05 08:33:34 2011

SMON: enabling cache recovery

Successfully onlined Undo Tablespace 2.

Verifying file header compatibility for11gtablespace encryption..

Verifying11gfile header compatibility for tablespace encryption completed

SMON: enabling tx recovery

Database Characterset is AL32UTF8

No Resource Manager plan active

replication_dependency_tracking turned off (no async multimaster replication found)

Starting background process QMNC

Tue Apr 05 08:33:38 2011

QMNC started with pid=21, OS id=5721

Tue Apr 05 08:33:40 2011

Completed: alter database open

总结：在进入Open阶段之后，Oracle进行了一系列的工作。

由于上次关闭数据库非正常关闭。Oracle会自动检查是否需要进行前滚、后滚的回复操作。此时，控制文件、日志文件以及各个数据文件上的检查点和SCN记录就是一个重要依据。在本段代码中，Oracle发现需要进行crash恢复，并且使用Group3、顺序号为87的重做日志进行恢复；
确定当前的Oracle Redo Log。进行Seq87的恢复后，确定Seq88作为当前使用的Redo Log记录；
其他组件进程启动；

应该说，Oracle的启动是一个相当复杂精密、且层层相关的过程。深入的进行alert_SID.log分析，可以帮助我们更好的理解Oracle内部机制和机理，进而提高我们使用数据库、解决数据库问题的能力。

转自：http://space.itpub.net/17203031/viewspace-691806