An Overview of Oracle 10g RAC Services (CRS), Trace Files, and Log Files


0. First, some Oracle 10g RAC services and concepts

Cluster Synchronization Services (CSS):
Manages the cluster configuration by controlling which nodes are members of the
cluster and by notifying members when a node joins or leaves the cluster. If
you are using third-party clusterware, then the css process interfaces with your
clusterware to manage node membership information.

Cluster Ready Services (CRS):
The primary program for managing high availability operations within a cluster.
Anything that the crs process manages is known as a cluster resource which could
be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an
application process, and so on. The crs process manages cluster resources based on
the resource's configuration information that is stored in the OCR. This includes
start, stop, monitor and failover operations. The crs process generates events when
a resource status changes. When you have installed Oracle RAC, crs monitors the Oracle
instance, Listener, and so on, and automatically restarts these components when a failure
occurs. By default, the crs process makes five attempts to restart a resource and then
does not make further restart attempts if the resource does not restart.
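
The restart policy described above (a fixed number of attempts, then give up) can be illustrated with a small sketch. This is not Oracle's code; the function name restart_with_limit and the start callback are hypothetical, and only the default of five attempts comes from the text.

```python
from typing import Callable


def restart_with_limit(start: Callable[[], bool], max_attempts: int = 5) -> int:
    """Call start() until it succeeds or max_attempts is reached.

    Returns the 1-based number of the attempt that succeeded,
    or 0 if every attempt failed and the resource is left down.
    """
    for attempt in range(1, max_attempts + 1):
        if start():
            return attempt
    return 0


# A hypothetical resource that comes up on the third try.
outcomes = iter([False, False, True])
print(restart_with_limit(lambda: next(outcomes)))  # 3

# A resource that never starts: five attempts, then no more.
calls = []
print(restart_with_limit(lambda: calls.append(1) or False))  # 0
print(len(calls))  # 5
```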

Event Management (EVM):
A background process that publishes events that crs creates.

Oracle Notification Service (ONS):
A publish and subscribe service for communicating Fast Application Notification
(FAN) events.

RACG:
Extends clusterware to support Oracle-specific requirements and complex resources.
Runs server callout scripts when FAN events occur.

Process Monitor Daemon (OPROCD):
This process is locked in memory to monitor the cluster and provide I/O fencing.
OPROCD performs its check, stops running, and if the wake up is beyond the expected
time, then OPROCD resets the processor and reboots the node. An OPROCD failure results
in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux
platforms.

Voting Disk:
Manages cluster membership by way of a health check and arbitrates cluster
ownership among the instances in case of network failures. Oracle RAC uses the
voting disk to determine which instances are members of a cluster. The voting disk
must reside on shared disk. For high availability, Oracle recommends that you have
multiple voting disks. The Oracle Clusterware enables multiple voting disks but you
must have an odd number of voting disks, such as three, five, and so on. If you define
a single voting disk, then you should use external mirroring to provide redundancy.
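
Why an odd number? A node has to be able to reach more than half of the configured voting disks to stay in the cluster, so an even count adds no extra tolerance over the odd count just below it. The arithmetic can be sketched as follows; the helper names are made up for this illustration.

```python
def required_voting_disks(total: int) -> int:
    """Smallest number of disks that is strictly more than half of total."""
    return total // 2 + 1


def tolerated_failures(total: int) -> int:
    """How many voting disks can be lost while a majority stays reachable."""
    return total - required_voting_disks(total)


for total in (1, 2, 3, 4, 5):
    print(total, required_voting_disks(total), tolerated_failures(total))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Three disks tolerate one failure and so do four, which is why the fourth disk buys nothing.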

Oracle Cluster Registry (OCR):
Maintains cluster configuration information as well as configuration information
about any cluster database within the cluster. The OCR also manages information about
processes that Oracle Clusterware controls. The OCR stores configuration information
in a series of key-value pairs within a directory tree structure. The OCR must reside
on shared disk that is accessible by all of the nodes in your cluster. The Oracle
Clusterware can multiplex the OCR and Oracle recommends that you use this feature
to ensure cluster high availability. You can replace a failed OCR online, and you can
update the OCR through supported APIs such as Enterprise Manager, the Server Control
Utility (SRVCTL), or the Database Configuration Assistant (DBCA).

 

 

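Main CRS services

Main CRS processes
(1) crsd   - handles HA operations and manages CRS resources such as the listener, VIP, ONS, and GSD; managed and started by the root user
(2) ocssd  - manages node membership and inter-node communication; runs as the oracle user
(3) oprocd - process monitor for the cluster; runs only when no vendor clusterware is in use
(4) evmd   - event management daemon; runs as the oracle user

Related log locations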
$ORA_CRS_HOME/log/nodename/crsd
$ORA_CRS_HOME/crs/init
$ORA_CRS_HOME/css/log
$ORA_CRS_HOME/css/init
$ORA_CRS_HOME/evm/log
$ORA_CRS_HOME/evm/init
$ORA_CRS_HOME/srvm/log

 

 


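1. In this environment ORACLE_BASE=/u01/product and ORACLE_HOME=/u01/product/oracle. The listing below is from the database's admin directory (the audit file in section A gives its path as /u01/product/admin/mxdell).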

mxrac05$ls
adump  bdump  cdump  dpdump  hdump  pfile  udump  

 

 

A. adump holds audit files with the .aud suffix, which record logins by the SYS user.

Audit file /u01/product/admin/mxdell/adump/ora_24065.aud
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/product/oracle
System name:    Linux
Node name:      mxrac05
Release:        2.6.18-128.el5
Version:        #1 SMP Wed Dec 17 11:41:38 EST 2008
Machine:        x86_64
Instance name: mxdell5
Redo thread mounted by this instance: 5
Oracle process number: 54
Unix process pid: 24065, image: oracle@mxrac05

Mon Sep 27 14:09:34 2010
LENGTH : '153'
ACTION :[7] 'CONNECT'
DATABASE USER:[3] 'SYS'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[12] 'harrison.han'
CLIENT TERMINAL:[11] 'MXWS-004570'
STATUS:[1] '0'
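
The key/value lines at the end of the audit file are easy to pick apart when many .aud files need to be scanned. The following is a minimal sketch that assumes the records look exactly like the sample above (an upper-case key, an optional [length], and a single-quoted value); the regular expression and function name are illustrative, not an Oracle-defined format.

```python
import re

# KEY :[len] 'value'   or   KEY : 'value'
FIELD_RE = re.compile(r"^([A-Z ]+?)\s*:(?:\[\d+\])?\s*'(.*)'\s*$")


def parse_audit_record(lines):
    """Collect the quoted key/value fields of one audit record into a dict."""
    record = {}
    for line in lines:
        match = FIELD_RE.match(line.strip())
        if match:
            record[match.group(1)] = match.group(2)
    return record


sample = """Mon Sep 27 14:09:34 2010
LENGTH : '153'
ACTION :[7] 'CONNECT'
DATABASE USER:[3] 'SYS'
PRIVILEGE :[6] 'SYSDBA'
CLIENT USER:[12] 'harrison.han'
CLIENT TERMINAL:[11] 'MXWS-004570'
STATUS:[1] '0'
""".splitlines()

record = parse_audit_record(sample)
print(record["DATABASE USER"], record["PRIVILEGE"], record["CLIENT USER"])
# SYS SYSDBA harrison.han
```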

 

 


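B. bdump holds the trace files of all background processes and the alert log of each instance.

For example, alert_mxdell1.log is the alert log of instance mxdell1 on RAC node 1,
and the .trc files next to it belong to that instance's background processes. There
are also directories such as cdmp_20101005101745 that contain .trw files, which are
another kind of trace file. When such files appear, a matching entry can usually be
found in the alert log, for example: Tue Jul 13 22:01:16 2010 Trace dumping is performing
id=[cdmp_20100713220116]. For these errors a timestamped dump directory
bdump/cdmp_timestamp is created, where timestamp is the time the error occurred.
Core dumps of this kind are almost always caused by bugs.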


The directory cdmp_timestamp contains in-memory traces of Oracle RAC instance
failure information.

Diagnosability Daemon (DIAG)
The Diagnosability Daemon captures diagnostic information related to process and
instance failures. This information can be used by Oracle Worldwide Support to help
analyze and resolve problems with your database and instances.

The DIAG process writes its diagnostic information to files in a subdirectory of
the directory specified by the initialization parameter BACKGROUND_DUMP_DEST. The
subdirectories are named cdmp_timestamp, where timestamp identifies when the
subdirectory, and its trace information, was written.
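
Because the directory name encodes the time of the dump, it can be turned back into a timestamp and matched against the alert log. A small sketch, assuming the 14-digit suffix is YYYYMMDDHHMMSS as the examples in this post suggest (the function name is made up):

```python
import re
from datetime import datetime

CDMP_RE = re.compile(r"^cdmp_(\d{14})$")


def cdmp_time(dirname):
    """Return the datetime encoded in a cdmp_<timestamp> name, or None."""
    match = CDMP_RE.match(dirname)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


print(cdmp_time("cdmp_20100713220116"))  # 2010-07-13 22:01:16
print(cdmp_time("cdmp_20101005101745"))  # 2010-10-05 10:17:45
print(cdmp_time("alert_mxdell1.log"))    # None
```

The first value lines up with the alert log entry stamped Tue Jul 13 22:01:16 2010 quoted earlier.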


Example:

mxrac01$ls -alhrt
total 4.1M
drwxr-xr-x 9 oracle dba 4.0K Mar  2  2010 ..
drwxr-x--- 2 oracle dba  24K Sep 17 10:59 cdmp_20100917105859
drwxr-x--- 2 oracle dba  24K Oct  1 22:14 cdmp_20101001221411
drwxr-x--- 2 oracle dba  24K Oct  5 10:17 cdmp_20101005101745
-rw-rw---- 1 oracle dba 1.1K Nov 12 23:00 mxdell1_m001_2876.trc
-rw-rw---- 1 oracle dba 1.1K Nov 13 19:00 mxdell1_m001_8809.trc
-rw-rw---- 1 oracle dba 1.1K Nov 13 21:00 mxdell1_m001_28585.trc
-rw-rw---- 1 oracle dba  964 Nov 14 17:00 mxdell1_m001_15713.trc
-rw-rw---- 1 oracle dba  773 Nov 14 17:10 mxdell1_q002_19652.trc
-rw-rw---- 1 oracle dba  976 Nov 14 17:12 mxdell1_arc0_13125.trc
-rw-rw---- 1 oracle dba  68K Nov 14 17:29 mxdell1_diag_12986.trc
-rw-rw---- 1 oracle dba  984 Nov 14 18:00 mxdell1_m001_7499.trc
-rw-rw---- 1 oracle dba 747K Nov 14 18:51 mxdell1_arc1_13127.trc
-rw-rw---- 1 oracle dba 1.1K Nov 14 19:00 mxdell1_m001_1103.trc
-rw-rw---- 1 oracle dba 432K Nov 15 21:41 mxdell1_lmd0_12992.trc
-rw-rw---- 1 oracle dba 1.1K Nov 15 22:00 mxdell1_m001_2387.trc
-rw-rw---- 1 oracle dba 240K Nov 15 22:01 mxdell1_lms3_13006.trc
-rw-rw---- 1 oracle dba 291K Nov 15 22:12 mxdell1_lms4_13011.trc
-rw-rw---- 1 oracle dba 212K Nov 15 22:26 mxdell1_lms1_12998.trc
drwxr-xr-x 5 oracle dba  12K Nov 15 22:33 .
-rw-r----- 1 oracle dba 1.1M Nov 15 22:43 alert_mxdell1.log
-rw-rw---- 1 oracle dba 1.9K Nov 15 22:43 mxdell1_lgwr_13027.trc
-rw-rw---- 1 oracle dba 229K Nov 15 22:46 mxdell1_lms0_12994.trc
-rw-rw---- 1 oracle dba 239K Nov 15 22:47 mxdell1_lms5_13015.trc
-rw-rw---- 1 oracle dba 253K Nov 15 22:47 mxdell1_lms2_13002.trc
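
The trace file names in this listing follow the pattern instance_process_pid.trc (and .trw inside the cdmp directories). When a bdump directory grows large, it can help to group the files by background process. A minimal sketch under that naming assumption; the regular expression and helper name are illustrative only:

```python
import re
from collections import Counter

TRACE_RE = re.compile(
    r"^(?P<instance>\w+?)_(?P<process>[a-z0-9]+)_(?P<pid>\d+)\.(?P<ext>trc|trw)$"
)


def parse_trace_name(filename):
    """Split a trace file name into instance, process, pid and extension."""
    match = TRACE_RE.match(filename)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


names = [
    "mxdell1_m001_2876.trc",
    "mxdell1_m001_8809.trc",
    "mxdell1_diag_12986.trc",
    "mxdell1_lms3_13006.trc",
    "alert_mxdell1.log",
]
parsed = [p for p in map(parse_trace_name, names) if p is not None]
per_process = Counter(p["process"] for p in parsed)
print(per_process["m001"], per_process["diag"], per_process["lms3"])  # 2 1 1
```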

 

Example of the dump files (.trw files inside a cdmp directory):

mxrac01$ls -alhrt 
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_smon_13801.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_reco_13803.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_qmnc_14006.trw
-rw-rw---- 1 oracle dba  32K Sep 17 10:59 mxdell1_pz99_14013.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_psp0_13754.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_pmon_13748.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_994.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9903.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9720.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9552.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9546.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9541.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9436.trw
-rw-rw---- 1 oracle dba  32K Sep 17 10:59 mxdell1_ora_9420.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_9200.trw 
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_10287.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_10090.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_ora_10068.trw
-rw-rw---- 1 oracle dba  38K Sep 17 10:59 mxdell1_mmon_13807.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_mmnl_13809.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_mman_13788.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lms5_13784.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lms4_13780.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lms3_13776.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lms2_13772.trw
-rw-rw---- 1 oracle dba  32K Sep 17 10:59 mxdell1_lms1_13766.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lms0_13760.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lmon_13756.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_lmd0_13758.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_lgwr_13797.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_lck0_13823.trw
-rw-rw---- 1 oracle dba  10K Sep 17 10:59 mxdell1_j005_4826.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_j002_7331.trw
-rw-rw---- 1 oracle dba  30K Sep 17 10:59 mxdell1_j001_10548.trw
-rw-rw---- 1 oracle dba  32K Sep 17 10:59 mxdell1_j000_10521.trw
-rw-rw---- 1 oracle dba 6.0K Sep 17 10:59 mxdell1_diag_13750.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_dbw2_13795.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_dbw1_13793.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_dbw0_13790.trw
-rw-rw---- 1 oracle dba  38K Sep 17 10:59 mxdell1_ckpt_13799.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_cjq0_13805.trw
-rw-rw---- 1 oracle dba  36K Sep 17 10:59 mxdell1_arc1_13937.trw
-rw-rw---- 1 oracle dba  34K Sep 17 10:59 mxdell1_arc0_13935.trw

 


mxrac01$crs_stat -t 
Name           Type           Target    State     Host       
------------------------------------------------------------
ora.mxdell.db  application    ONLINE    ONLINE    mxrac01    
ora....l1.inst application    ONLINE    ONLINE    mxrac01    
ora....l3.inst application    ONLINE    ONLINE    mxrac03    
ora....l4.inst application    ONLINE    ONLINE    mxrac04    
ora....l5.inst application    ONLINE    ONLINE    mxrac05    
ora....01.lsnr application    ONLINE    ONLINE    mxrac01    
ora....c01.gsd application    ONLINE    ONLINE    mxrac01    
ora....c01.ons application    ONLINE    ONLINE    mxrac01    
ora....c01.vip application    ONLINE    ONLINE    mxrac01    
ora....03.lsnr application    ONLINE    ONLINE    mxrac03    
ora....c03.gsd application    ONLINE    ONLINE    mxrac03    
ora....c03.ons application    ONLINE    ONLINE    mxrac03    
ora....c03.vip application    ONLINE    ONLINE    mxrac03    
ora....04.lsnr application    ONLINE    ONLINE    mxrac04    
ora....c04.gsd application    ONLINE    ONLINE    mxrac04    
ora....c04.ons application    ONLINE    ONLINE    mxrac04    
ora....c04.vip application    ONLINE    ONLINE    mxrac04    
ora....05.lsnr application    ONLINE    ONLINE    mxrac05    
ora....c05.gsd application    ONLINE    ONLINE    mxrac05    
ora....c05.ons application    ONLINE    OFFLINE              
ora....c05.vip application    ONLINE    ONLINE    mxrac05    
mxrac01$
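
In the crs_stat -t output above, one resource (the ONS on mxrac05) has Target ONLINE but State OFFLINE. With many resources it is easier to let a script find such rows. The sketch below assumes the whitespace-separated column layout shown above and uses a shortened copy of that output; the names CrsResource and parse_crs_stat are made up for this example.

```python
from collections import namedtuple

CrsResource = namedtuple("CrsResource", "name type target state host")


def parse_crs_stat(text):
    """Parse `crs_stat -t` style output into a list of CrsResource rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the dashed separator and the header row.
        if len(parts) < 4 or parts[0] == "Name":
            continue
        host = parts[4] if len(parts) > 4 else None
        rows.append(CrsResource(parts[0], parts[1], parts[2], parts[3], host))
    return rows


sample = """Name           Type           Target    State     Host
------------------------------------------------------------
ora.mxdell.db  application    ONLINE    ONLINE    mxrac01
ora....c05.gsd application    ONLINE    ONLINE    mxrac05
ora....c05.ons application    ONLINE    OFFLINE
ora....c05.vip application    ONLINE    ONLINE    mxrac05
"""

rows = parse_crs_stat(sample)
mismatched = [r.name for r in rows if r.target != r.state]
print(mismatched)  # ['ora....c05.ons']
```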

 


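C.  cdump contains many directories whose names start with core_. A core file is the
memory image of a process, and users normally do not need to read these files. The
number after core_ is the process ID.

cdump stores the core information written when Oracle hits an internal error; a
corresponding file exists in bdump or udump. The cdump information is useful to
Oracle Support. To change the location, set the core_dump_dest parameter.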


mxrac01$ls -alhrt
total 60K
drwxr-x---  2 oracle dba 4.0K Dec  7  2009 core_2662
drwxr-x---  2 oracle dba 4.0K Dec 16  2009 core_20943
drwxr-x---  2 oracle dba 4.0K Dec 21  2009 core_27896
drwxr-x---  2 oracle dba 4.0K Dec 21  2009 core_23068
drwxr-x---  2 oracle dba 4.0K Dec 21  2009 core_21673
drwxr-x---  2 oracle dba 4.0K Dec 21  2009 core_2039
drwxr-x---  2 oracle dba 4.0K Dec 21  2009 core_11681
drwxr-x---  2 oracle dba 4.0K Jan 21  2010 core_18290
drwxr-x---  2 oracle dba 4.0K Jan 22  2010 core_4613
drwxr-x---  2 oracle dba 4.0K Jan 22  2010 core_18850
drwxr-x---  2 oracle dba 4.0K Jan 22  2010 core_5644
drwxr-x---  2 oracle dba 4.0K Feb 16  2010 core_15445
drwxr-xr-x  9 oracle dba 4.0K Mar  2  2010 ..
drwxr-x---  2 oracle dba 4.0K Aug  8 16:52 core_31833
drwxr-xr-x 15 oracle dba 4.0K Aug  8 16:52 .
mxrac01$

The files inside these directories look like this:

mxrac01$ls -alhrt
total 14M
-rw-------  1 oracle dba  16M Dec 21  2009 core.23068

Opening this file shows that it is a binary file.


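D.  dpdump: the default Data Pump directory (DATA_PUMP_DIR); it holds Data Pump export/import dump files and their log files.

E.  hdump: rarely contains anything; it is meant for Oracle High Availability log files.

F.  udump: trace files from foreground (user) sessions, for example the session trace file produced after enabling SQL trace.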

 

 

2.  Logs of the CRS services (mxrac01 is the hostname of node 1).

Logs under the CRS directory

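admin => summary information
alertmxrac01.log => summary information about CRS state changes on the node; for details, check the CSS log
client => logs of CRS initialization and OCR applications, including CLSCFG, CSS, OCRCHECK, OCRCONFIG, OCRDUMP and OIFCFG
crsd => crsd logs; CRS waits until CSS is running in fatal mode, then starts crsd, which starts the related resources
cssd => cssd logs: node stop, start, reconfiguration and so on; every problem is recorded here, so this is the most important log
evmd => evmd logs
racg => logs for ONS and VIP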

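When a problem occurs, start with ocssd.log, then check the crsd log around the same time as needed; all resource-related messages are in crsd.log.
If the logs do not reveal the key information, raise the log level of the relevant module (default log levels differ between versions), for example:

crsctl debug log css CSSD:5
crsctl debug log crs CRSD:3

The modules available for each component can be listed with crsctl lsmodules crs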
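
To make the reading order above concrete, here is a small sketch that builds the paths of the main log files for one node, in the order suggested (ocssd.log first, then crsd.log). It assumes the $ORA_CRS_HOME/log/nodename layout used in this post; the function name and the /u01/crs home in the usage line are hypothetical.

```python
import posixpath


def crs_log_paths(crs_home, nodename):
    """Return the main CRS log files for a node, in suggested reading order."""
    base = posixpath.join(crs_home, "log", nodename)
    return [
        posixpath.join(base, "cssd", "ocssd.log"),       # node membership, reconfig
        posixpath.join(base, "crsd", "crsd.log"),        # resource start/stop/failover
        posixpath.join(base, "evmd", "evmd.log"),        # event manager
        posixpath.join(base, "alert" + nodename + ".log"),  # summary of state changes
    ]


# /u01/crs is a made-up CRS home used only for illustration.
for path in crs_log_paths("/u01/crs", "mxrac01"):
    print(path)
```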

Example:

mxrac01$ls -alh
total 200K
drwxr-xr-x 45 root   dba  4.0K Nov 18  2009 .
drwxrwxr-x  6 oracle dba  4.0K Nov 19  2009 ..
drwxr-xr-x  2 root   dba  4.0K Feb 24  2010 bin
drwxrwxr-x  4 oracle dba  4.0K Nov 18  2009 cdata
drwxrwxr-x  5 oracle dba  4.0K Apr  2  2010 cfgtoollogs
...
drwxr-xr-x  4 oracle dba  4.0K Nov 18  2009 log
drwxrwx--- 10 oracle dba  4.0K Nov 18  2009 network
drwxrwx---  5 oracle dba  4.0K Nov 18  2009 nls
....
drwxrwx---  4 oracle dba  4.0K Nov 18  2009 xdk


mxrac01$ls
admin  alertmxrac01.log  client  crsd  cssd  evmd  racg


mxrac01$ls
crsd.log


mxrac01$ls
cssdOUT.log  mxrac01.pid  oclsmon  oclsomon  ocssd.l05  ocssd.log  ocssd.trc


mxrac01$ls
evmd.log  evmdOUT.log


mxrac01$ls -alh
total 104K
drwxrwxr-t 5 oracle dba  4.0K Nov 15 01:37 .
drwxr-xr-t 8 root   dba  4.0K Jan 24  2010 ..
-rw-r--r-- 1 oracle dba   494 Dec 13  2009 evtf.log
-rw-r--r-- 1 oracle dba  2.1K Dec 13  2009 ora.mxdell.db.log
-rw-r--r-- 1 oracle dba   56K Nov  7 02:21 ora.mxrac01.ons.log
-rw-r--r-- 1 root   root 4.0K Jun 20 19:04 ora.mxrac01.vip.log
-rw-r--r-- 1 root   root 2.4K Jun 20 19:04 ora.mxrac03.vip.log
-rw-r--r-- 1 root   root 1.5K Apr  2  2010 ora.mxrac04.vip.log
-rw-r--r-- 1 root   root  247 Apr  2  2010 ora.mxrac05.vip.log
drwxrwxrwt 2 oracle dba  4.0K Nov 18  2009 racgeut
drwxrwxrwt 2 oracle dba  4.0K Nov 18  2009 racgevtf
drwxrwxrwt 2 oracle dba  4.0K Nov 18  2009 racgmain

 

 

RACG

mxrac01$ls
admin  alertmxrac01.log  client  crsd  cssd  evmd  racg

In RAC, the CRS log directory has a subdirectory named racg, which holds logs for ONS, VIP, and GSD.

mxrac01$ls
evtf.log           ora.mxrac01.ons.log  ora.mxrac03.vip.log  ora.mxrac05.vip.log  racgevtf
ora.mxdell.db.log  ora.mxrac01.vip.log  ora.mxrac04.vip.log  racgeut              racgmain

From the Oracle documentation:
RACG:
Extends clusterware to support Oracle-specific requirements and complex resources.
Runs server callout scripts when FAN events occur.

 

Source: ITPUB blog, http://blog.itpub.net/35489/viewspace-678252/. If you repost this article, please credit the source; otherwise legal action may be taken.
