alert_SID.log是Oracle数据库运行阶段的运行日志。在Oracle工作时,会将所有会影响到整体性能的操作、变化和错误信息记录在一个日志中,就是Alert_log。在Oracle11g中,提供了文本文件和XML格式文件两种日志格式。对我们来说,alert_SID.log是进行数据库故障诊断、监控和调整的重要研究对象。
一、环境准备
采用Oracle11gR2 For Linux版本进行研究,具体过程如下。
Last login: Sun Apr 3 10:51:03 2011
[root@oracle11g~]# ps -ef | grep ora //当前后台没有与oracle相关的进程;
root 5514 5481 0 08:28 pts/0 00:00:00 grep ora
[root@oracle11g~]#
[root@oracle11g~]# su - oracle
[oracle@oracle11g~]$ env | grep ORACLE//启动所必需使用的环境变量;
ORACLE_SID=wilson
ORACLE_BASE=/u01
ORACLE_HOME=/u01/oracle
[oracle@oracle11g~]$
二、启动阶段操作
分阶段进行启动操作,之后我们查看效果。
1、nomount阶段
[oracle@oracle11g~]$ sqlplus /nolog
SQL*Plus: Release11.2.0.1.0 Production on Tue Apr 5 08:31:25 2011
Copyright (c) 1982, 2009, Oracle. All rights reserved.
SQL> conn / as sysdba;
Connected to an idle instance. //当前实例未启动
SQL> startup nomount
ORACLE instance started.//实例启动,共享内容空间分配!
Total System Global Area 414298112 bytes
Fixed Size 1336904 bytes
Variable Size 310380984 bytes
Database Buffers 96468992 bytes
Redo Buffers 6111232 bytes
此时,我们观察后台活动进程情况,可以看到。
[oracle@oracle11g~]$ ps -ef | grep ora
oracle 5594 1 0 08:32 ? 00:00:00 ora_pmon_wilson
oracle 5596 1 0 08:32 ? 00:00:00 ora_vktm_wilson
oracle 5600 1 0 08:32 ? 00:00:00 ora_gen0_wilson
oracle 5602 1 0 08:32 ? 00:00:00 ora_diag_wilson
oracle 5604 1 0 08:32 ? 00:00:00 ora_dbrm_wilson
oracle 5606 1 0 08:32 ? 00:00:00 ora_psp0_wilson
oracle 5608 1 0 08:32 ? 00:00:00 ora_dia0_wilson
oracle 5610 1 2 08:32 ? 00:00:00 ora_mman_wilson
oracle 5612 1 0 08:32 ? 00:00:00 ora_dbw0_wilson
oracle 5614 1 0 08:32 ? 00:00:00 ora_lgwr_wilson
oracle 5616 1 0 08:32 ? 00:00:00 ora_ckpt_wilson
oracle 5618 1 0 08:32 ? 00:00:00 ora_smon_wilson
oracle 5620 1 0 08:32 ? 00:00:00 ora_reco_wilson
oracle 5622 1 0 08:32 ? 00:00:00 ora_mmon_wilson
oracle 5624 1 0 08:32 ? 00:00:00 ora_mmnl_wilson
oracle 5626 1 0 08:32 ? 00:00:00 ora_d000_wilson
oracle 5628 1 0 08:32 ? 00:00:00 ora_s000_wilson
oracle 5630 1 0 08:32 ? 00:00:00 ora_s001_wilson
oracle 5632 5547 1 08:32 ? 00:00:00 oraclewilson (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
在nomount状态,虽然没有涉及到数据文件读取、控制文件读取等内容。但是最重要的数据库实例已经建立。核心后台进程PMON、DBWn、LGWr、SMON和CKPT已经建立,共享内存SGA区已经分配。
2、Mount状态
进入mount状态
SQL> alter database mount;
Database altered.
3、Open状态
进入Open状态
SQL> alter database open;
Database altered.
SQL> select count(*) from scott.emp;
COUNT(*)
----------
14
三、日志分析
根据系统参数可以获取到alert_log的路径,文件名默认为alert_<sid>.log,其中sid为数据库的唯一标志。在实验环境中,数据库名称为alert_wilson.log。
SQL> show parameter dump_dest;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
background_dump_dest string /u01/diag/rdbms/wilson/wilson/
trace
core_dump_dest string /u01/diag/rdbms/wilson/wilson/
cdump
user_dump_dest string /u01/diag/rdbms/wilson/wilson/
trace
[oracle@oracle11g~]$ date
Tue Apr 5 08:37:40 CST 2011
根据时间间隔进行整理日志,我们看到三个阶段数据库操作情况。
1、nomount阶段
nomount阶段进行的主要是Oracle实例的构建,共享内存的开启,以及初始化参数的读取定位。下面为日志相关部分片段。
Tue Apr 05 08:32:27 2011
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =27
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database11gEnterpriseEdition Release11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side spfile /u01/oracle/dbs/spfilewilson.ora
System parameters with non-default values:
processes = 150
memory_target =396M
control_files = "/u01/oradata/WILSON/controlfile/o1_mf_6bcsqm50_.ctl"
control_files = "/u01/flash_recovery_area/WILSON/controlfile/o1_mf_6bcsqmd8_.ctl"
db_block_size = 8192
compatible = "11.2.0.0.0"
db_create_file_dest = "/u01/oradata"
db_recovery_file_dest = "/u01/flash_recovery_area"
db_recovery_file_dest_size=3852M
undo_tablespace = "UNDOTBS1"
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=wilsonXDB)"
shared_servers = 2
audit_file_dest = "/u01/admin/wilson/adump"
audit_trail = "DB"
db_name = "wilson"
open_cursors = 300
diagnostic_dest = "/u01"
Tue Apr 05 08:32:30 2011
PMON started with pid=2, OS id=5594
Tue Apr 05 08:32:30 2011
VKTM started with pid=3, OS id=5596 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Tue Apr 05 08:32:30 2011
GEN0 started with pid=4, OS id=5600
Tue Apr 05 08:32:30 2011
DIAG started with pid=5, OS id=5602
Tue Apr 05 08:32:30 2011
DBRM started with pid=6, OS id=5604
Tue Apr 05 08:32:31 2011
PSP0 started with pid=7, OS id=5606
Tue Apr 05 08:32:31 2011
DIA0 started with pid=8, OS id=5608
Tue Apr 05 08:32:31 2011
MMAN started with pid=9, OS id=5610
Tue Apr 05 08:32:31 2011
DBW0 started with pid=10, OS id=5612
Tue Apr 05 08:32:31 2011
CKPT started with pid=12, OS id=5616
Tue Apr 05 08:32:31 2011
LGWR started with pid=11, OS id=5614
Tue Apr 05 08:32:31 2011
SMON started with pid=13, OS id=5618
Tue Apr 05 08:32:31 2011
RECO started with pid=14, OS id=5620
Tue Apr 05 08:32:31 2011
MMON started with pid=15, OS id=5622
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Apr 05 08:32:31 2011
MMNL started with pid=16, OS id=5624
starting up 2 shared server(s) ...
ORACLE_BASE from environment = /u01
总结:在本阶段,Oracle忙于进行如下阶段顺序操作。
- 根据环境变量,确定初始化参数文件位置。根据环境变量中对ORACLE_SID、ORACLE_HOME等的设置,确定要启动数据库在物理磁盘上的位置。根据Oracle的OFS结构,来确定使用参数文件可能的位置。根据spfile<sid>.ora、spfile.ora、init<sid>.ora和init.ora的顺序来定位参数文件。
- 读取参数文件,确定基础的参数内容。通过读取参数文件,Oracle启动过程获取到该数据库的基本信息,包括控制文件位置、块大小、各种类型目录的具体位置、特殊表空间名称等内容。其中,对控制文件的定位,是进一步读取更多数据库信息,定位数据库文件的重要前置步骤;
2、mount阶段
在mount阶段,数据库日志的内容相对简单些了。
Tue Apr 05 08:33:09 2011
alter database mount
Tue Apr 05 08:33:13 2011
Successful mount of redo thread 1, with mount id 3868018373
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount
总结:mount阶段,Oracle主要进行的各类型文件(数据文件、日志文件)的定位寻找工作。通过在nomount阶段加载的控制文件信息,确定了当前系统的文件的具体位置已经SCN信息。在此阶段,可以进行数据恢复方面的一些工作。
3、Open阶段
Tue Apr 05 08:33:29 2011
alter database open
Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
read 18880 KB redo, 1853 data blocks need recovery
Started redo application at
Thread 1: logseq 87, block 4964
Recovery of Online Redo Log: Thread 1 Group 3 Seq 87 Reading mem 0
Mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_3_6bcsqtfj_.log
Mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_3_6bcsqtwv_.log
Completed redo application of 15.28MB
Completed crash recovery at
Thread 1: logseq 87, block 42725, scn 2965918
1853 data blocks read, 1853 data blocks written, 18880 redo k-bytes read
Tue Apr 05 08:33:33 2011
Thread 1 advanced to log sequence 88 (thread open)
Thread 1 opened at log sequence 88
Current log# 1 seq# 88 mem# 0: /u01/oradata/WILSON/onlinelog/o1_mf_1_6bcsqpty_.log
Current log# 1 seq# 88 mem# 1: /u01/flash_recovery_area/WILSON/onlinelog/o1_mf_1_6bcsqqt0_.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Apr 05 08:33:34 2011
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for11gtablespace encryption..
Verifying11gfile header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Tue Apr 05 08:33:38 2011
QMNC started with pid=21, OS id=5721
Tue Apr 05 08:33:40 2011
Completed: alter database open
总结:在进入Open阶段之后,Oracle进行了一系列的工作。
- 由于上次关闭数据库非正常关闭。Oracle会自动检查是否需要进行前滚、后滚的回复操作。此时,控制文件、日志文件以及各个数据文件上的检查点和SCN记录就是一个重要依据。在本段代码中,Oracle发现需要进行crash恢复,并且使用Group3、顺序号为87的重做日志进行恢复;
- 确定当前的Oracle Redo Log。进行Seq87的恢复后,确定Seq88作为当前使用的Redo Log记录;
- 其他组件进程启动;
应该说,Oracle的启动是一个相当复杂精密、且层层相关的过程。深入的进行alert_SID.log分析,可以帮助我们更好的理解Oracle内部机制和机理,进而提高我们使用数据库、解决数据库问题的能力。