1.1 orarootagent, cssdagent, and oraagent, started by OHASD.
1.2 orarootagent and oraagent, started by CRSD.
GI startup
[root@node1 ~]# crsctl start crs -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.mdnsd' on 'node1'
CRS-2672: Attempting to start 'ora.evmd' on 'node1'
CRS-2676: Start of 'ora.mdnsd' on 'node1' succeeded
CRS-2676: Start of 'ora.evmd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'node1'
CRS-2676: Start of 'ora.gpnpd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'node1'
CRS-2676: Start of 'ora.gipcd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'node1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'node1'
CRS-2676: Start of 'ora.cssdmonitor' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'node1'
CRS-2672: Attempting to start 'ora.diskmon' on 'node1'
CRS-2676: Start of 'ora.diskmon' on 'node1' succeeded
CRS-2676: Start of 'ora.crf' on 'node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'node1'
CRS-2672: Attempting to start 'ora.ctssd' on 'node1'
CRS-2676: Start of 'ora.ctssd' on 'node1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'node1'
CRS-2676: Start of 'ora.asm' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'node1'
CRS-2676: Start of 'ora.storage' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'node1'
CRS-2676: Start of 'ora.crsd' on 'node1' succeeded
CRS-6023: Starting Oracle Cluster Ready Services-managed resources
CRS-6017: Processing resource auto-start for servers: node1
CRS-2672: Attempting to start 'ora.ons' on 'node1'
CRS-2672: Attempting to start 'ora.chad' on 'node1'
CRS-2676: Start of 'ora.chad' on 'node1' succeeded
CRS-2676: Start of 'ora.ons' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'node1'
CRS-2676: Start of 'ora.DATA.dg' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.orcl.db' on 'node1'
CRS-2676: Start of 'ora.orcl.db' on 'node1' succeeded
CRS-2672: Attempting to start 'ora.orcl.orcltest.svc' on 'node1'
CRS-2676: Start of 'ora.orcl.orcltest.svc' on 'node1' succeeded
CRS-6016: Resource auto-start has completed for server node1
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.
There is also a convenient OS-level way to tell which daemon started each of these agent processes.
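The first message is CRS-4123, which marks the start of the resources managed by OHASD:
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2676: Start of 'ora.cssd' on 'node1' succeeded
CRS-2676: Start of 'ora.crsd' on 'node1' succeeded
Once ora.crsd is up, the resources managed by CRSD are started:
CRS-6023: Starting Oracle Cluster Ready Services-managed resources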
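The two phases can also be separated mechanically from saved crsctl start crs -wait output. The following is a minimal sketch; the function name label_start_phases is hypothetical and not part of any Oracle tool. It relies only on the message IDs seen in the output above: the first CRS-4123 opens the OHASD phase, CRS-6023 opens the CRSD phase, and CRS-2676 reports a successful start.

```shell
# Hypothetical helper: reads saved "crsctl start crs -wait" output on stdin
# and prints "<phase> <resource>" for every successfully started resource.
label_start_phases() {
  awk '
    # The first CRS-4123 opens the OHASD phase (the closing CRS-4123 is ignored).
    /^CRS-4123:/ && phase == "" { phase = "OHASD"; next }
    # CRS-6023 opens the CRSD phase.
    /^CRS-6023:/                { phase = "CRSD";  next }
    # CRS-2676 is a successful start; the resource name is the first
    # single-quoted field on the line (\047 is the single quote character).
    /^CRS-2676:/ && phase != "" {
      split($0, parts, "\047")
      print phase, parts[2]
    }
  '
}

# Usage, assuming the output was saved to a file of your choosing:
#   label_start_phases < start_crs.log
```

Lines that do not start with one of these three message IDs (the shell prompt, CRS-2672 attempts, and so on) are skipped.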
GI can be started in two ways: automatically along with the OS, or manually with the crsctl start crs command. Either way, the init.ohasd process must already be running.
init.ohasd
/etc/systemd/system/oracle-ohasd.service    Linux 7
/etc/init.d/init.ohasd                      Linux 6
/etc/inittab                                Linux 5
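A process that stays resident in the OS. Even after the cluster has been shut down manually, the init.ohasd run process should still be present.
If the cluster cannot be started, first check whether the init.ohasd process exists.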
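A minimal sketch of that check. The function name check_init_ohasd is hypothetical; it reads ps -ef output on stdin, which also makes it usable against a saved process listing.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and reports whether
# any line mentions init.ohasd. Returns non-zero when it is missing.
check_init_ohasd() {
  if grep -q 'init\.ohasd'; then
    echo "init.ohasd is running"
  else
    echo "init.ohasd is NOT running"
    return 1
  fi
}

# On a cluster node:
#   ps -ef | check_init_ohasd
```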
S96ohasd
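When the OS boots, GI starts automatically as well. This is implemented by the S96ohasd script.
The script is only invoked during OS boot at runlevel 3 or 5, which is why on Linux 7 it lives in /etc/rc.d/rc3.d and /etc/rc.d/rc5.d: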
[root@node1 ~]# locate S96ohasd
/etc/rc.d/rc3.d/S96ohasd
/etc/rc.d/rc5.d/S96ohasd
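Whether GI starts automatically with the OS can be controlled. To disable automatic startup (crsctl enable crs turns it back on):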
crsctl disable crs
When investigating why GI fails to start, checking the OS log is essential, because both init.ohasd and S96ohasd write much of their output to the OS log.
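A minimal sketch of pulling those messages out of the OS log. The default path /var/log/messages is an assumption (a common syslog location on Linux; on other setups pass the actual log file, or saved journalctl output, as the first argument), and the function name ohasd_os_log_lines is hypothetical.

```shell
# Hypothetical helper: prints every line of the OS log that mentions ohasd
# (case-insensitive, so it covers init.ohasd and S96ohasd alike).
ohasd_os_log_lines() {
  # $1: path to the OS log; /var/log/messages is an assumed default.
  grep -i 'ohasd' "${1:-/var/log/messages}"
}

# Usage (as root):
#   ohasd_os_log_lines | tail -n 50
```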
crsctl start crs -wait
crsctl check cluster
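crsctl stat res -t shows the state of the resources managed by CRSD, and crsctl stat res -t -init shows the state of the resources managed by OHASD. The columns are:
Name            resource name
Target          expected state of the resource
State           actual state of the resource
Server          node on which the resource runs
State details   details of the actual state
Target is the state that GI has recorded as the one the resource should be in. This information is stored in the OCR (the OCR is the file that records all important cluster information; it resides on shared disk and is accessed by all nodes).
If Target is ONLINE but State is OFFLINE, the resource failed to start. If Target is OFFLINE, the resource will not be started automatically; to start it, the user has to run the corresponding resource start command.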
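The Target/State rule can be written down as a small lookup. This is a sketch only: classify_resource is a hypothetical name, the two values are assumed to have been read off the crsctl stat res -t output by hand, and the "running as expected" and fallback branches are additions for completeness rather than something the rule above spells out.

```shell
# Hypothetical helper: $1 = Target, $2 = State, as shown by crsctl stat res -t.
classify_resource() {
  case "$1/$2" in
    ONLINE/OFFLINE) echo "start failed" ;;          # should be up, but is not
    OFFLINE/*)      echo "not auto-started" ;;      # needs an explicit start command
    ONLINE/ONLINE)  echo "running as expected" ;;
    *)              echo "check State details" ;;   # any other combination
  esac
}

# Usage:
#   classify_resource ONLINE OFFLINE
```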
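There is far more to GI startup than what is covered above. For example, OHASD has to read the OLR in order to start, CSSD needs the voting files, and CRSD has to read the OCR (which is why ora.asm and ora.storage are started before ora.crsd in the output above).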
Related processes
[root@node1 ~]# ps -ef | grep /u01/64bit/app/19.3.0/grid/bin/
cat /proc/<PID>/environ | tr "\000" "\n" | grep __CLSAGENT_LOGDIR_NAME
[root@node1 ~]# cat /proc/24629/environ | tr "\000" "\n" | grep __CLSAGENT_LOGDIR_NAME
__CLSAGENT_LOGDIR_NAME=ohasd
[root@node1 ~]# cat /proc/27514/environ | tr "\000" "\n" | grep __CLSAGENT_LOGDIR_NAME
__CLSAGENT_LOGDIR_NAME=crsd
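The same lookup can be wrapped in a small function. This is a sketch; agent_logdir_name is a hypothetical name, and it takes the path to an environ file so that it can be pointed at the /proc entry of any agent PID found with ps.

```shell
# Hypothetical helper: prints the value of __CLSAGENT_LOGDIR_NAME from a
# NUL-separated environ file (ohasd or crsd in the examples above).
agent_logdir_name() {
  # $1: path to an environ file, e.g. /proc/24629/environ
  tr '\000' '\n' < "$1" | sed -n 's/^__CLSAGENT_LOGDIR_NAME=//p'
}

# Usage (as root, with a PID taken from the ps output):
#   agent_logdir_name /proc/24629/environ
```

It prints nothing when the variable is not set for that process.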
GI logs
<Grid_Base>/diag/crs//crs/trace/