个人介绍
----------------------------------------------------------------------
Oracle & PG ACE
Oracle 11g OCM
TecTalk 技术交流社区创办者
OceanBase & IvorySQL 南京地区组织者
TiDB 布道者
个人公众号
----------------------------------------------------------------------
背景描述
因为一套 Oracle 19c RAC 生产库近期由于业务调整,导致存储空间近期暴涨,原有 ASM 存储空间紧张,甚至为了解决存储空间紧张问题,将归档日志也从原来的保留三天调整到仅保留一天。
近期向系统部门申请了存储扩容,为该库扩容 2T 存储。
系统部门同事为该库扩容了 2T 共享存储,通过数据库服务器 /var/log/messages 也看到了存储扩容加盘的相关信息:
# 节点一 /var/log/messages 日志信息
Sep 23 09:57:49 xxxx-xxx-oracle-db1 kernel: sd 15:0:1:16: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Sep 23 09:57:49 xxxx-xxx-oracle-db1 kernel: sd 15:0:0:17: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Sep 23 09:57:49 xxxx-xxx-oracle-db1 kernel: sd 16:0:0:17: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Sep 23 09:57:49 xxxx-xxx-oracle-db1 kernel: sd 16:0:1:16: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
# 节点二 /var/log/messages 日志信息
Sep 23 09:57:49 xxxx-xxx-oracle-db2 kernel: sd 16:0:0:15: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Sep 23 09:57:49 xxxx-xxx-oracle-db2 kernel: sd 15:0:0:15: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Sep 23 09:57:49 xxxx-xxx-oracle-db2 kernel: sd 16:0:1:7: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
Sep 23 09:57:49 xxxx-xxx-oracle-db2 kernel: sd 15:0:1:14: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical
但目前 /dev 目录下还未看到添加的盘,虽然通过某些命令可以扫描并识别到盘,但为了更稳妥一些,还是决定逐台重启服务器以便识别到新加的盘。
首先通过 /u01/app/19.3.0/grid/bin/crsctl stop crs 命令关闭节点一 CRS 服务,CRS 停的很顺利,然后执行 reboot 重启服务器。
服务器重启后,等了将近10分钟,发现 CRS 服务并未自动拉起,观察该节点的 grid 和 oracle alert 日志,还停留在停服务时的信息。
# ASM alert 部分日志信息
......
NOTE: cache deleting context for group DG 2/0xd5d020d0
2024-09-25T21:01:19.083386+08:00
SUCCESS: ALTER DISKGROUP ALL DISMOUNT /* asm agent *//* {0:0:45422} */
Shutting down archive processes
Archiving is disabled
IM on ADG: Start of Empty Journal
IM on ADG: End of Empty Journal
JIT: pid 81222 requesting stop
Shutting down archive processes
Archiving is disabled
2024-09-25T21:01:19.169879+08:00
Stopping background process VKTM
2024-09-25T21:01:19.170205+08:00
JIT: pid 81222 requesting stop
2024-09-25T21:01:23.439365+08:00
freeing rdom 2
freeing the fusion rht of pdb 2
freeing rdom 1
freeing the fusion rht of pdb 1
freeing rdom 0
freeing the fusion rht of pdb 0
2024-09-25T21:01:29.361866+08:00
Instance shutdown complete (OS id: 81222)
# oracle alert 部分日志信息
......
2024-09-25T21:00:56.488974+08:00
XXXXXXX(4):KILL SESSION for sid=(469, 17539):
XXXXXXX(4): Reason = dbms_service.kill_session
XXXXXXX(4): Mode = KILL HARD FORCE -/-/-
XXXXXXX(4): Requestor = USER (orapid = 9, ospid = 11369, inst = 1)
XXXXXXX(4): Owner = Process: USER (orapid = 291, ospid = 97566)
XXXXXXX(4): Result = ORA-0
Shutting down ORACLE instance (abort) (OS id: 11347)
2024-09-25T21:00:56.493071+08:00
Shutdown is initiated by oraagent.bin@xxxx-xxx-oracle-db1 (TNS V1-V3.
License high water mark = 184
2024-09-25T21:00:56.493551+08:00
XXXXXXX(4):Process termination requested for pid 24229 [source = rdbms], [info = 2] [request issued by pid: 11369, uid: 600]
USER (ospid: 11347): terminating the instance
2024-09-25T21:00:56.495634+08:00
XXXXXXX(4):opiodr aborting process unknown ospid (95353) as a result of ORA-1092
2024-09-25T21:00:56.495778+08:00
XXXXXXX(4):opiodr aborting process unknown ospid (74997) as a result of ORA-1092
2024-09-25T21:00:56.496643+08:00
XXXXXXX(4):opiodr aborting process unknown ospid (84546) as a result of ORA-1092
2024-09-25T21:00:56.497097+08:00
XXXXXXX(4):opiodr aborting process unknown ospid (61218) as a result of ORA-1092
2024-09-25T21:00:56.497338+08:00
XXXXXXX(4):opiodr aborting process unknown ospid (23137) as a result of ORA-1092
2024-09-25T21:00:56.992794+08:00
ORA-1092 : opitsk aborting process
2024-09-25T21:00:57.280909+08:00
ORA-1092 : opitsk aborting process
2024-09-25T21:00:57.536540+08:00
Instance terminated by USER, pid = 11347
2024-09-25T21:00:57.647183+08:00
Warning: 3 processes are still attacheded to shmid 4:
(size: 53248 bytes, creator pid: 10070, last attach/detach pid: 11347)
Instance shutdown complete (OS id: 11347)
2024-09-25T21:42:26.135450+08:00
Starting ORACLE instance (normal) (OS id: 10187)
2024-09-25T21:42:26.367155+08:00
****************************************************
Sys-V shared memory will be used for creating SGA
****************************************************
2024-09-25T21:42:26.368298+08:00
**********************************************************************
2024-09-25T21:42:26.368377+08:00
Dump of system resources acquired for SHARED GLOBAL AREA (SGA)
到底什么原因导致 CRS 未自动拉起服务呢。
排查过程
首先通过 /u01/app/19.3.0/grid/bin/crsctl check crs 查看该节点 crs相关信息,结果如下所示:
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
如果集群服务正常,应该是如下信息显示:
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
通常出现上述 CRS 异常情况有如下几种原因:
-
CRS进程未完全启动:CRS进程可能未完全启动或由于某些错误而无法启动。
-
网络问题:RAC节点之间的网络通信问题,如私网或公共网络的配置错误或网络延迟过高,可能导致CRS无法正常通信。
-
Clusterware配置文件损坏:如果Clusterware配置文件(如OCR或VD files)损坏,可能导致CRS启动失败。
-
权限或资源问题:操作系统资源(如内存、文件描述符等)不够,会阻碍CRS进程的启动。
-
操作系统相关问题:OS参数配置不当或有相关补丁未打全可能导致此类问题。
然后尝试通过 crsctl start crs 进行手工启动,再此通过 crsctl check crs 检查仍然是上述信息。
通过 /u01/app/19.3.0/grid/bin/crsctl stat res -t 命令查看到只有节点二是正常的。
切换到 grid 用户,使用 kfod disks=all status=true 查看到的磁盘信息也都是正常的。
[grid@xxxx-xxx-oracle-db1 ~]$ kfod disks=all status=true
--------------------------------------------------------------------------------
Disk Size Header Path User Group
================================================================================
1: 8176 MB MEMBER /dev/oracleasm/disks/VOL01 grid asmadmin
2: 8176 MB MEMBER /dev/oracleasm/disks/VOL02 grid asmadmin
3: 8176 MB MEMBER /dev/oracleasm/disks/VOL03 grid asmadmin
4: 1638384 MB MEMBER /dev/oracleasm/disks/VOL04 grid asmadmin
5: 1638384 MB MEMBER /dev/oracleasm/disks/VOL05 grid asmadmin
6: 1638384 MB MEMBER /dev/oracleasm/disks/VOL06 grid asmadmin
7: 1638384 MB MEMBER /dev/oracleasm/disks/VOL07 grid asmadmin
8: 1638384 MB MEMBER /dev/oracleasm/disks/VOL08 grid asmadmin
9: 1638384 MB MEMBER /dev/oracleasm/disks/VOL09 grid asmadmin
10: 1638384 MB MEMBER /dev/oracleasm/disks/VOL10 grid asmadmin
11: 1638384 MB MEMBER /dev/oracleasm/disks/VOL11 grid asmadmin
12: 1535984 MB MEMBER /dev/oracleasm/disks/VOL12 grid asmadmin
13: 1535984 MB MEMBER /dev/oracleasm/disks/VOL13 grid asmadmin
14: 1535984 MB MEMBER /dev/oracleasm/disks/VOL14 grid asmadmin
15: 1535984 MB MEMBER /dev/oracleasm/disks/VOL15 grid asmadmin
16: 2047984 MB MEMBER /dev/oracleasm/disks/VOL16 grid asmadmin
17: 2047984 MB MEMBER /dev/oracleasm/disks/VOL17 grid asmadmin
18: 2047984 MB MEMBER /dev/oracleasm/disks/VOL18 grid asmadmin
19: 2047984 MB MEMBER /dev/oracleasm/disks/VOL19 grid asmadmin
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
然后使用 grid 用户进入到 /u01/app/grid/diag/crs/trace/xxxx-xxx-oracle-db1 目录,查看 octssd.trc 信息,结果发现一个异常地方,如下日志所示:
2024-09-25 21:00:10.349 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:00:10.371 : CTSS:4110374656: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2024-09-25 21:00:17.286 :GIPCHTHR:4140300032: gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 31050 loopCount 44
2024-09-25 21:00:27.886 :GIPCHTHR:4142401280: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 30050 loopCount 40 sendCount 4 recvCount 11 postCount 4 sendCmplCount 4 recvCmplCount 4
2024-09-25 21:00:33.313 : CTSS:4146603776: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xcc], offset[0 ms]}, length=[8].
2024-09-25 21:00:40.382 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:00:40.382 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:00:40.382 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:00:40.382 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:00:40.402 : CTSS:4110374656: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2024-09-25 21:00:48.224 :GIPCHTHR:4140300032: gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 30940 loopCount 45
2024-09-25 21:00:57.928 :GIPCHTHR:4142401280: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 30040 loopCount 39 sendCount 3 recvCount 10 postCount 3 sendCmplCount 3 recvCmplCount 3
2024-09-25 21:01:03.317 : CTSS:4146603776: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xcc], offset[0 ms]}, length=[8].
2024-09-25 21:01:10.413 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:01:10.413 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:01:10.413 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:01:10.413 : CTSS:4110374656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 21:01:10.434 : CTSS:4110374656: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2024-09-25 21:01:12.506 : CTSS:4106172160: ctsselect_vermon7: Expecting clssgsevGRPPRIV event. Ignoring 1 event.
2024-09-25 21:01:12.506 : CTSS:4106172160: ctsselect_vermon7: Expecting clssgsevGRPPRIV event. Ignoring 1 event.
2024-09-25 21:01:17.507 : CTSS:4106172160: ctsselect_vermon8: Received clssgsevGRPPRIV event.
2024-09-25 21:01:17.509 : CTSS:4106172160: ctsselect_vermon10_1: Retrieved av_data from grp pridata. Upgrade state [11].
2024-09-25 21:01:17.509 : CTSS:4106172160: ctsselect_vermon11: Retrieved Active Version [318767104].
2024-09-25 21:01:17.509 : CTSS:4106172160: ctsselect_vermon12: Active version[318767104] didn't change.
2024-09-25 21:01:19.161 :GIPCHTHR:4140300032: gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 30930 loopCount 45
2024-09-25 21:01:27.965 :GIPCHTHR:4142401280: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 30040 loopCount 43 sendCount 4 recvCount 12 postCount 4 sendCmplCount 4 recvCmplCount 4
2024-09-25 21:01:32.241 : CTSS:4146603776: ctss_exitcb: CTSS daemon exiting.
2024-09-25 21:01:32.241 : CTSS:4146603776: ctss_exitcb: Exit requested by [default].
2024-09-25 21:01:32.241 : CTSS:4146603776: CTSS daemon exiting
Trace file /u01/app/grid/diag/crs/xxxx-xxx-oracle-db1/crs/trace/octssd.trc
Oracle Database 19c Clusterware Release 19.0.0.0.0 - Production
Version 19.18.0.0.0 Copyright 1996, 2023 Oracle. All rights reserved.
default:1285974656: 1: clskec:has:CLSU:910 4 args[clsdAdr_CLSK_err][mod=clsdadr.c][loc=(:CLSD00302:)][msg=2024-09-25 20:56:09.168 (:CLSD00302:) Trace file size and number of segments fetched from environment variable: ORA_DAEMON_TRACE_FILE_OPTIONS filesize=26214400,numsegments=10 Detected in function clsdAdrGetEnvVar_TFO at line number 6826. ]
CLSB:1285974656: [ INFO] Argument count (argc) for this daemon is 2
CLSB:1285974656: [ INFO] Argument 0 is: /u01/app/19.3.0/grid/bin/octssd.bin
CLSB:1285974656: [ INFO] Argument 1 is: reboot
2024-09-25 20:56:09.176 : CTSS:1285974656: tb_ctss_main invoked: 4294713610
2024-09-25 20:56:09.176 : CTSS:1285974656: Oracle Database CTSS Release 19.0.0.0.0 Production Copyright 2006, 2017 Oracle. All rights reserved.
2024-09-25 20:56:09.176 : CTSS:1285974656: ctss_main: The Cluster Time Synchronization Service is started with option [reboot].
2024-09-25 20:56:09.176 : CTSS:1285974656: tb_ctss_init: 0 (ms)
2024-09-25 20:56:09.176 : CTSS:1285974656: ctss_scls_init: SCLs Context is 0x55acdcdcf8a0
2024-09-25 20:56:09.197 : CLSDMT:1124288256: [ INFO] PID for the Process [8426], connkey CTSSD
2024-09-25 20:56:10.110 : CTSS:1285974656: tb_ctss_init: Initializing CSS: 930 (ms)
2024-09-25 20:56:10.110 : CTSS:1124288256: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x40], offset[0 ms]}, length=[8].
2024-09-25 20:56:10.115 : CTSS:1285974656: ctss_css_init: CSS Context is 0x55acdcdfaf50
2024-09-25 20:56:10.115 : CTSS:1285974656: tb_ctss_init: CSS initialized: 940 (ms)
2024-09-25 20:56:10.115 : CTSS:1285974656: ctss_init: CSS group_size= 2048
2024-09-25 20:56:10.115 : CTSS:1285974656: ctss_init: Cluster name: cluster-xxxx
2024-09-25 20:56:10.115 : CTSS:1285974656: ctss_init: CTSS production mode
2024-09-25 20:56:10.115 : CTSS:1285974656: ctss_init: Env var CTSS_REBOOT is undefined or contains non-boolean value. Ignoring CTSS_REBOOT.
2024-09-25 20:56:10.116 : CTSS:1285974656: tb_ctss_init: Getting vendor_sw: 940 (ms)
2024-09-25 20:56:10.125 : CTSS:1285974656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 20:56:10.125 : CTSS:1285974656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 20:56:10.125 : CTSS:1285974656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 20:56:10.125 : CTSS:1285974656: sclsctssvts_ivsr2: default pid file not found
2024-09-25 20:56:10.143 : CTSS:1285974656: ctss_check_vendor_sw: Vendor time sync software is not detected. status [1].
2024-09-25 20:56:10.143 : CTSS:1285974656: tb_ctss_init: Obtained vendor_sw: 960 (ms)
2024-09-25 20:56:10.143 : CTSS:1285974656: tb_ctss_init: Getting AV: 960 (ms)
2024-09-25 20:56:10.147 : default:1285974656: clsvactversion:4: Retrieving Active Version from local storage.
2024-09-25 20:56:10.152 : CTSS:1285974656: clsctss_r_av4: Current active version [19.0.0.0.0] [318767104].
2024-09-25 20:56:10.153 : CTSS:1285974656: tb_ctss_init: Obtained AV: 970 (ms)
2024-09-25 20:56:10.153 : CTSS:1285974656: tb_ctss_main: Initializing CCL: 970 (ms)
2024-09-25 20:56:10.154 : CRSCCL:1285974656: clsCclInit called by process 8426: groupname=CTSSGROUP commOptions=0 clusterType=0, traceLevel=0
default:1285974656: u_set_comp_error: comptype '103' : error '29'
2024-09-25 20:56:10.174 : CRSCCL:1285974656: Failed to init OCR to get active version. PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]Checking active version in OLR.
2024-09-25 20:56:10.176 : default:1285974656: clsvactversion:4: Retrieving Active Version from local storage.
2024-09-25 20:56:10.184 : CRSCCL:1285974656: USING GIPC ============
2024-09-25 20:56:10.185 : CRSCCL:1285974656: clsCclGipcListen: Attempting to listen on gipcha://xxxx-xxx-oracle-db1:CTSSGROUP_1.
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: active cipher list:
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_DHE_RSA_WITH_AES_256_GCM_SHA384
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_RSA_WITH_AES_128_GCM_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_DHE_RSA_WITH_AES_256_CBC_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_RSA_WITH_AES_128_CBC_SHA256
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384
2024-09-25 20:56:10.238 : GIPCTLS:1285974656: gipclibTlsGetActiveCipherList: TLS_RSA_WITH_AES_128_CBC_SHA
2024-09-25 20:56:10.246 :GIPCHGEN:1285974656: gipchaInternalRegister: Initializing HA GIPC global flags 0x1
2024-09-25 20:56:10.255 : default:1285974656: clsvactversion:4: Retrieving Active Version from local storage.
2024-09-25 20:56:10.263 :GIPCHGEN:1285974656: gipchaInternalGetActiveSoftVersion: active version 19.0.0.0.0 [318767104] softwareVersion 19.0.0.0.0 [318767104]
2024-09-25 20:56:10.271 :GIPCHGEN:1285974656: gipchaNodeCreate: adding new node 0x55acdcf2b340 { host '', haName 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', srcLuid 7668191f-00000000, dstLuid 00000000-00000000 numInf 0, sentRegister 0, localMonitor 0, baseStream 0x55acdce9b860 type gipchaNodeType12001 (20), nodeIncarnation 00000000-00000000, incarnation 0, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x1}
2024-09-25 20:56:10.374 :GIPCHGEN:1285974656: gipchaGetClusterMode: gipchaGetClusterMode():REGULAR Cluster mode
2024-09-25 20:56:10.392 : GIPC:1285974656: gipchaInternalRegister: TLS flag 1
2024-09-25 20:56:10.393 :GIPCHTHR:1120085760: gipchaWorkerThread: starting worker thread hctx 0x7f6a380773d0 [0000000000000011] { gipchaContext : host 'xxxx-xxx-oracle-db1', name 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', luid '7668191f-00000000', name2 xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c, numNode 0, numInf 0, maxPriority 0, clientMode 0, nodeIncarnation 00000000-00000000 usrFlags 0x0, flags 0x4c800 }
2024-09-25 20:56:10.395 :GIPCHDEM:1117984512: gipchaDaemonThread: starting daemon thread hctx 0x7f6a380773d0 [0000000000000011] { gipchaContext : host 'xxxx-xxx-oracle-db1', name 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', luid '7668191f-00000000', name2 xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c, numNode 0, numInf 0, maxPriority 0, clientMode 0, nodeIncarnation 00000000-00000000 usrFlags 0x0, flags 0x4c800 }
2024-09-25 20:56:10.483 : GIPCLIB:1117984512: gipclibGetClusterGuid: retrieved cluster guid 764417f35eee4f81ffced8bfe6a27bd2
2024-09-25 20:56:10.487 :GIPCHDEM:1117984512: gipchaDaemonThread: successfully open the shared memory smem://xxxx-xxx-oracle-db1_gipcd1318_764417f35eee4f81ffced8bfe6a27bd2
2024-09-25 20:56:10.487 :GIPCHDEM:1117984512: gipchaDaemonConnect: connecting to daemon addr ipc://gipcd_xxxx-xxx-oracle-db1
2024-09-25 20:56:11.113 : CTSS:1124288256: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0x40], offset[0 ms]}, length=[8].
2024-09-25 20:56:11.564 :GIPCHDEM:1117984512: gipchaDaemonProcessInfUpdate: ip 10.90.0.161 subnet 10.90.0.0 mask 255.255.252.0 state 1 inc 0 flags 0x101
2024-09-25 20:56:11.564 :GIPCHGEN:1117984512: gipchaNodeAddInterfaceF [gipchaDaemonProcessInfUpdate : gipchaDaemonThread.c : 7476]: adding interface info 0x7f6a24053410 { host '', haName 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', local (nil), ip '10.90.0.161', subnet '10.90.0.0', mask '255.255.252.0', mac 'b4-05-5d-62-58-21', ifname 'team1', numRef 0, numFail 0, idxBoot 0, flags 0x1 }
2024-09-25 20:56:11.564 :GIPCHDEM:1117984512: gipchaDaemonProcessInfUpdate: completed interface update host 'xxxx-xxx-oracle-db1', haName '', hctx 0x7f6a380773d0 [0000000000000011] { gipchaContext : host 'xxxx-xxx-oracle-db1', name 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', luid '7668191f-00000000', name2 xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c, numNode 0, numInf 1, maxPriority 0, clientMode 0, nodeIncarnation 00000000-00000000 usrFlags 0x0, flags 0x40805 }
2024-09-25 20:56:11.564 :GIPCHDEM:1117984512: gipchaDaemonProcessFailTransientInfs: failed transient interfaces (if any) for host xxxx-xxx-oracle-db1 haname xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c
2024-09-25 20:56:11.564 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x55acdd19c090 type gipchaClientReqTypePublish (1)
2024-09-25 20:56:11.564 :GIPCHTHR:1120085760: gipchaWorkerCreateInterface: created local interface for node 'xxxx-xxx-oracle-db1', haName 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', inf 'udp://10.90.0.161:64282' inf 0x7f6a24053410
2024-09-25 20:56:11.565 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x7f6a3003e790 type gipchaClientReqTypeInfPublish (7)
2024-09-25 20:56:11.606 : GIPCTLS:1285974656: gipcmodTlsGetWalletObjFromCred: using rootCredDomName: NULL, baseCredDomName:nodecert
2024-09-25 20:56:11.614 : GIPCTLS:1285974656: gipcmodTlsGetWalletObjFromCred: found base dom: nodecert
2024-09-25 20:56:11.621 : GIPCTLS:1285974656: gipcmodTlsGetWalletObjFromCred: found one certificate
2024-09-25 20:56:11.653 : GIPC:1285974656: gipclibTlsTraceCallback: function: nzosReadCertParams
2024-09-25 20:56:11.653 : GIPC:1285974656: sslconf.ora location is..
2024-09-25 20:56:11.653 : GIPC:1285974656: gipclibTlsTraceCallback: function: nzosReadCertParams
2024-09-25 20:56:11.653 : GIPC:1285974656: /u01/app/19.3.0/grid/ldap/admin/sslconf.ora
2024-09-25 20:56:11.653 : GIPC:1285974656: gipclibTlsTraceCallback: function: nzosReadConfFile
2024-09-25 20:56:11.653 : GIPC:1285974656: sslconf.ora file could not be opened. Continuing.
2024-09-25 20:56:11.670 : GIPCTLS:1285974656: gipcmodTlsGetWalletObjFromBuffer: using wallet buffer
2024-09-25 20:56:11.672 : GIPCTLS:1285974656: gipcmodTlsSetAuthFlags: nzcred auth (global 1) flags, feature: 32, optional:0
2024-09-25 20:56:11.674 : GIPCTLS:1285974656: gipcmodTlsSetCreds: [nzGlobalCtx]: certLen 1, gcertparamsLen 1
2024-09-25 20:56:11.675 : GIPCTLS:1285974656: gipcmodTlsAuthInit: server credentails set
2024-09-25 20:56:11.675 : GIPCTLS:1285974656: gipcmodTlsSetAuthFlags: nzcred auth (global 0) flags, feature: 32, optional:0
2024-09-25 20:56:11.675 : GIPCTLS:1285974656: gipcmodTlsAuthInit: tls context initialized successfully
2024-09-25 20:56:11.675 : CRSCCL:1285974656: gipcListen() Listening on gipcha://xxxx-xxx-oracle-db1:CTSSGROUP_1 endp=615
2024-09-25 20:56:11.690 : CRSCCL:1100080896: CSS Group Registration complete.
2024-09-25 20:56:11.697 : CRSCCL:1100080896: cclGetMemberData called, current mbrcount=2
2024-09-25 20:56:11.697 : CRSCCL:1100080896: Current map: 0x55acdce38498: : 1, 2
2024-09-25 20:56:11.697 : CRSCCL:1100080896: Previous map: 0x7f6a419123f8:
2024-09-25 20:56:11.700 : CRSCCL:1100080896: Member (1, 4294716110, xxxx-xxx-oracle-db1:19.0.0.0.0) @ found.
2024-09-25 20:56:11.708 : CRSCCL:1100080896: Member (2, 4294715690, xxxx-xxx-oracle-db2:19.0.0.0.0) @ found.
2024-09-25 20:56:11.711 : CRSCCL:1100080896: Obtained first membership map.
2024-09-25 20:56:11.711 : CRSCCL:1100080896: Dumping member data ------------------
2024-09-25 20:56:11.711 : CRSCCL:1100080896: Member (1, 4294716110) on node xxxx-xxx-oracle-db1 port=.
2024-09-25 20:56:11.711 : CRSCCL:1100080896: Member (2, 4294715690) on node xxxx-xxx-oracle-db2 port=.
2024-09-25 20:56:11.711 : CRSCCL:1100080896: Done ------------------
2024-09-25 20:56:11.711 : CRSCCL:1100080896: clsCclMembershipChangeCallback: Calling mem callback fn.with ctx.
2024-09-25 20:56:11.711 : CTSS:1100080896: ctsselect_msccb1: Receive membership change event. Inc num[23] New master [2] members count[2]
2024-09-25 20:56:11.711 : CTSS:1100080896: ctsselect_msccb2: Waiting for CCL initialization to complete. mode [0x40]
2024-09-25 20:56:11.711 : CTSS:1285974656: ctss_ccl_init4: clsCclInitWithCtx() finished. The local nodenum is [1].
2024-09-25 20:56:11.711 : CTSS:1285974656: ctss_configNodeForBC: CTSS is running on a hub node
2024-09-25 20:56:11.713 : CRSCCL:1094362880: cclCommunicationHandler started.
2024-09-25 20:56:11.718 : CTSS:1285974656: ctss_ccl_init6: Retrieved grpmap.
2024-09-25 20:56:11.718 : CTSS:1100080896: ctsselect_hr1: Receive membership change event for hub/master node. Inc num[23] New master [2] members count[2]
2024-09-25 20:56:11.718 : CTSS:1100080896: ctsselect_hr13: Host [xxxx-xxx-oracle-db2] Node num [2] is the CTSS master
2024-09-25 20:56:11.719 : CTSS:1285974656: ctss_ccl_init99: Successfully initialize CCL
2024-09-25 20:56:11.719 : CTSS:1285974656: tb_ctss_main: CCL initialized: 2540 (ms)
2024-09-25 20:56:11.719 : CTSS:1285974656: ctss_init: Spawn completed. Waiting for threads to join
2024-09-25 20:56:11.719 : CTSS:1285974656: tb_ctss_main: complete: 2540 (ms)
2024-09-25 20:56:11.720 : CRSCCL:1100080896: Waiting for reconfigs
CTSS:1090160384: clsctsselect_mm: Master Monitor thread started
2024-09-25 20:56:11.721 : CTSS:1090160384: tb_ctss_monitor_master_mode: 2540 (ms)
2024-09-25 20:56:11.721 : CRSCCL:1100080896: clsCclGetPriMemberData: Detected pridata change for node[1]. Retrieving it to the cache.
CTSS:1088059136: ctsselect_msm: Slave Monitor thread started
2024-09-25 20:56:11.722 : CTSS:1088059136: tb_ctss_monitor_slave_mode: 2540 (ms)
CTSS:1085957888: ctsselect_monitor_steysync_mode: Stepsync Monitor thread started. Executing step sync.
2024-09-25 20:56:11.722 : CTSS:1085957888: ctssslave_swm1_2: Ready to initiate new time sync process.
2024-09-25 20:56:11.722 : CRSCCL:1085957888: Dump Peer Conn (2, 4294715690): state=1 endp=0 nextMsgId=1 lastAckdId=0, conRetryCount=0,
2024-09-25 20:56:11.722 : CRSCCL:1085957888: Trying to establish connection with nodeid 2: gipcha://xxxx-xxx-oracle-db2:CTSSGROUP_2.
2024-09-25 20:56:11.759 :GIPCHGEN:1120085760: gipchaNodeCreate: adding new node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid 00000000-00000000 numInf 0, sentRegister 0, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 0, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x0}
2024-09-25 20:56:11.759 :GIPCHALO:1120085760: gipchaLowerProcessAcks: ESTABLISH finished for node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 0, sentRegister 0, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 0, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x8}
2024-09-25 20:56:11.759 :GIPCHALO:1120085760: gipchaLowerProcessMsgEstablish: TLS is switched ON for node 0x7f6a3003d520
2024-09-25 20:56:11.759 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x7f6a3003e790 type gipchaClientReqTypeMonitor (6)
2024-09-25 20:56:11.759 :GIPCHALO:1120085760: gipchaLowerProcessMsgEstablish: DIGEST is switched OFF for node 0x7f6a3003d520
2024-09-25 20:56:11.759 :GIPCHALO:1120085760: gipchaLowerProcessMsgEstablish: processed ESTABLISH from host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e',hctx 0x7f6a380773d0 [0000000000000011] { gipchaContext : host 'xxxx-xxx-oracle-db1', name 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', luid '7668191f-00000000', name2 xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c, numNode 1, numInf 1, maxPriority 0, clientMode 0, nodeIncarnation c5ef03ad-fffbe3a2 usrFlags 0x0, flags 0x40805 }, node 0x7f6a3003d520 hdr 0x7f6a30046618 { len 1448, seq 1, type gipchaHdrTypeRecvEstablish (5), lastSeq 0, lastAck 0, minAck 1, flags 0x11, srcLuid d03089ee-a82196f5, dstLuid 00000000-00000000, msgId 0 flags2 0x140014 offset 104 srcStreamId 00000000-00000000 dstStreamId 00000000-00000000 priority 0}
2024-09-25 20:56:11.759 :GIPCHDEM:1117984512: gipchaDaemonProcessClientMonitor: sent monitor request for req 0x7f6a3003e790, haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', hctx 0x7f6a380773d0 [0000000000000011] { gipchaContext : host 'xxxx-xxx-oracle-db1', name 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', luid '7668191f-00000000', name2 xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c, numNode 1, numInf 1, maxPriority 0, clientMode 0, nodeIncarnation c5ef03ad-fffbe3a2 usrFlags 0x0, flags 0x40805 }
2024-09-25 20:56:11.759 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x7f6a3004baa0 type gipchaClientReqTypeMonitor (6)
2024-09-25 20:56:11.759 :GIPCHALO:1120085760: gipchaLowerSendEstablish: sending establish message for node '0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 0, sentRegister 1, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 0, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x800000c}'
2024-09-25 20:56:11.763 :GIPCHDEM:1117984512: gipchaDaemonProcessInfUpdate: ip 10.90.0.162:16506 subnet 10.90.0.0 mask 255.255.252.0 state 1 inc 2 flags 0x2
2024-09-25 20:56:11.763 :GIPCHDEM:1117984512: gipchaDaemonProcessInfUpdate: pickup team1 for remote interface [0]
2024-09-25 20:56:11.763 :GIPCHGEN:1117984512: gipchaNodeAddInterfaceF [gipchaDaemonProcessInfUpdate : gipchaDaemonThread.c : 7476]: adding interface info 0x7f6a24218530 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', local (nil), ip '10.90.0.162:16506', subnet '10.90.0.0', mask '255.255.252.0', mac '', ifname 'team1', numRef 0, numFail 0, idxBoot 0, flags 0x2 }
2024-09-25 20:56:11.763 :GIPCHDEM:1117984512: gipchaDaemonProcessInfUpdate: completed interface update host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', hctx 0x7f6a380773d0 [0000000000000011] { gipchaContext : host 'xxxx-xxx-oracle-db1', name 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', luid '7668191f-00000000', name2 xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c, numNode 1, numInf 1, maxPriority 0, clientMode 0, nodeIncarnation c5ef03ad-fffbe3a2 usrFlags 0x0, flags 0x40805 }
2024-09-25 20:56:11.763 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x7f6a2404d8a0 type gipchaClientReqTypeMonitor (6)
2024-09-25 20:56:11.763 :GIPCHDEM:1117984512: gipchaDaemonProcessFailTransientInfs: failed transient interfaces (if any) for host xxxx-xxx-oracle-db2 haname xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e
2024-09-25 20:56:11.763 :GIPCHTHR:1120085760: gipchaWorkerCreateInterface: created remote interface for node 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', inf 'udp://10.90.0.162:16506' inf 0x7f6a24218530
2024-09-25 20:56:11.763 :GIPCHGEN:1120085760: gipchaWorkerAttachInterface: Interface attached inf 0x7f6a24218530 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', local 0x7f6a24053410, ip '10.90.0.162:16506', subnet '10.90.0.0', mask '255.255.252.0', mac '', ifname 'team1', numRef 0, numFail 0, idxBoot 0, flags 0x6 }
2024-09-25 20:56:11.764 :GIPCHTHR:1120085760: gipchaWorkerBalanceInterfaces: bad metrics publihed for inf 0x7f6a24218530 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', local 0x7f6a24053410, ip '10.90.0.162:16506', subnet '10.90.0.0', mask '255.255.252.0', mac '', ifname 'team1', numRef 0, numFail 0, idxBoot 1, flags 0x4006 }, numSend 1 numRecv 1 numStamp 0
2024-09-25 20:56:11.764 :GIPCHTHR:1120085760: gipchaWorkerBalanceInterfaces: bad metrics publihed for inf 0x7f6a24053410 { host '', haName 'xxxx-xxx-oracle-db1-000020ea-6-00000000fffc254c', local (nil), ip '10.90.0.161:64282', subnet '10.90.0.0', mask '255.255.252.0', mac 'b4-05-5d-62-58-21', ifname 'team1', numRef 1, numFail 0, idxBoot 0, flags 0x400d }, numSend 1 numRecv 1 numStamp 0
2024-09-25 20:56:11.764 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x7f6a0c0895f0 type gipchaClientReqTypePublish (1)
2024-09-25 20:56:11.764 :GIPCHALO:1120085760: gipchaLowerRecv: Added new stream 0x7f6a300585d0 { srcStreamId 00000000-00000000 dstStreamId 00000000-02d641be , hendp (nil) haNode 0x7f6a3003d520 contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [0 : 0], priority 0, duplicate recv 0, completed recv 0, completed send 0, total send 0, total recv 0, flags 0x0} to node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 2, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x800080c}
2024-09-25 20:56:11.765 :GIPCHAUP:1120085760: gipchaUpperProcessAccept: completed new hastream 0x7f6a300585d0 { srcStreamId 00000000-00000b0f dstStreamId 00000000-02d641be , hendp (nil) haNode 0x7f6a3003d520 contigSeq 1, lastAck 0, lastValidAck 0, sendSeq [1 : 1], priority 0, duplicate recv 0, completed recv 0, completed send 0, total send 0, total recv 1, flags 0x1} for hendp 0x7f6a3003ddf0 [0000000000000b0f] { gipchaEndpoint : port 'CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', peer ':', srcCid 00000000-00000b0f, dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 }
2024-09-25 20:56:11.766 :GIPCHTHR:1120085760: gipchaWorkerProcessClientConnect: starting resolve from connect for host:xxxx-xxx-oracle-db2, port:CTSSGROUP_2, cookie:0x7f6a0c0895f0
2024-09-25 20:56:11.766 :GIPCHDEM:1117984512: gipchaDaemonProcessClientReq: processing req 0x7f6a3003e790 type gipchaClientReqTypeResolve (4)
2024-09-25 20:56:11.767 :GIPCHAUP:1120085760: gipchaUpperCallbackConnAck: completed CONNECT:ACK ret gipcretSuccess, hendp 0x55acdced2e90 [00000000000006ff] { gipchaEndpoint : port 'CTSSGROUP_1', peer ':', srcCid 00000000-00000000, dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 }
2024-09-25 20:56:11.770 :GIPCHDEM:1117984512: gipchaDaemonCreateResolveResponse: creating resolveResponse for host:xxxx-xxx-oracle-db2, port:CTSSGROUP_2, haname:xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e, ret:0
2024-09-25 20:56:11.785 :GIPCHAUP:1120085760: gipchaUpperConnect: initiated connect for umsg 0x7f6a3003da60 { msg 0x7f6a300557f0, ret gipcretRequestPending (15), flags 0x6 }, msg 0x7f6a300557f0 { type gipchaMsgTypeConnect (3), srcPort '6-0000-0000-0000-0001', dstPort 'CTSSGROUP_2', srcCid 00000000-00000b04, cookie 00007f6a-3003da60 } dataLen 0, endp 0x7f6a0c099d60 [0000000000000b04] { gipchaEndpoint : port '6-0000-0000-0000-0001', peer ':', srcCid 00000000-00000b04, dstCid 00000000-00000000, numSend 0, maxSend 100, groupListType 1, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x0 } node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 2, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags
2024-09-25 20:56:11.785 :GIPCHAUP:1120085760: gipchaUpperAccept: completed accept endp 0x7f6a3003ddf0 [0000000000000b0f] { gipchaEndpoint : port 'CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', peer 'xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', srcCid 00000000-00000b0f, dstCid 00000000-02d641be, numSend 0, maxSend 100, groupListType 1, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x4204 }
2024-09-25 20:56:11.785 : GIPCTLS:1094362880: gipcmodTlsSetAuthFlags: nzcred auth (global 0) flags, feature: 32, optional:0
2024-09-25 20:56:11.785 : GIPCTLS:1094362880: gipcmodTlsAuthInit: tls context initialized successfully
2024-09-25 20:56:11.786 :GIPCHAUP:1120085760: gipchaUpperProcessConnectAck: CONNACK completed umsg 0x7f6a3003da60 { msg 0x7f6a300557f0, ret gipcretSuccess (0), flags 0xe }, msg 0x7f6a300557f0 { type gipchaMsgTypeConnect (3), srcPort '6-0000-0000-0000-0001', dstPort 'CTSSGROUP_2', srcCid 00000000-00000b04, cookie 00007f6a-3003da60 } dataLen 0, hendp 0x7f6a0c099d60 [0000000000000b04] { gipchaEndpoint : port '6-0000-0000-0000-0001', peer 'xxxx-xxx-oracle-db2:CTSSGROUP_2/18d2-6ece-d4b5-18ca', srcCid 00000000-00000b04, dstCid 00000000-02d6420e, numSend 0, maxSend 100, groupListType 1, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x204 } node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 2, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSee
2024-09-25 20:56:11.792 :GIPCGMOD:1094362880: gipcmodGipcCompleteConnect: [gipc] completed connect on endp 0x7f6a0c03ac00 [0000000000000970] { gipcEndpoint : localAddr 'gipcha://xxxx-xxx-oracle-db1:6-0000-0000-0000-0001', remoteAddr 'gipcha://xxxx-xxx-oracle-db2:CTSSGROUP_2/18d2-6ece-d4b5-18ca', numPend 1, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7f6a0c08e580, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x20020 }
2024-09-25 20:56:11.792 : GIPCTLS:1094362880: gipcmodTlsSetAuthFlags: nzcred auth (global 1) flags, feature: 32, optional:0
2024-09-25 20:56:11.795 : GIPCTLS:1094362880: gipcmodTlsSetCreds: [nzGlobalCtx]: certLen 1, gcertparamsLen 1
2024-09-25 20:56:11.796 : GIPCTLS:1094362880: gipcmodTlsAuthInit: client credentails set
2024-09-25 20:56:11.796 : GIPCTLS:1094362880: gipcmodTlsSetAuthFlags: nzcred auth (global 0) flags, feature: 32, optional:0
2024-09-25 20:56:11.796 : GIPCTLS:1094362880: gipcmodTlsAuthInit: tls context initialized successfully
2024-09-25 20:56:11.797 : CRSCCL:1094362880: clsCclGipcAccept: connection accepted from pEndp->endp= 0xb24
2024-09-25 20:56:11.797 : CRSCCL:1094362880: clsCclNewConn: added new conn to tempConList: newPeerCon = 1c0559d0
2024-09-25 20:56:11.823 : GIPCTLS:1094362880: gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL
2024-09-25 20:56:11.823 : GIPCTLS:1094362880: gipcmodTlsAuthStart: Peer is anonymous
2024-09-25 20:56:11.823 : GIPCTLS:1094362880: gipcmodTlsAuthStart: endpoint 0x7f6a3003db90 [0000000000000b24] { gipcEndpoint : localAddr 'gipcha://xxxx-xxx-oracle-db1:CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', remoteAddr 'gipcha://xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7f6a3004b770, sendp (nil) status 0flags 0x20138606, flags-2 0x10, usrFlags 0x20 }, auth state: gipcmodTlsAuthStateReady (3)
2024-09-25 20:56:11.823 : GIPCTLS:1094362880: gipcmodTlsAuthReady: TLS Auth completed Successfully
2024-09-25 20:56:11.824 : GIPCTLS:1094362880: gipcmodTlsAuthStart: TLS HANDSHAKE - SUCCESSFUL
2024-09-25 20:56:11.825 : GIPCTLS:1094362880: gipcmodTlsAuthStart: peerUser: NULL
2024-09-25 20:56:11.825 : GIPCTLS:1094362880: gipcmodTlsAuthStart: name:CN=764417f35eee4f81ffced8bfe6a27bd2_4294691760,O=Oracle Clusterware,
2024-09-25 20:56:11.825 : GIPCTLS:1094362880: gipcmodTlsAuthStart: name:CN=764417f35eee4f81ffced8bfe6a27bd2_1644983441,O=Oracle_Clusterware,
2024-09-25 20:56:11.825 : GIPCTLS:1094362880: gipcmodTlsAuthStart: endpoint 0x7f6a0c03ac00 [0000000000000970] { gipcEndpoint : localAddr 'gipcha://xxxx-xxx-oracle-db1:6-0000-0000-0000-0001', remoteAddr 'gipcha://xxxx-xxx-oracle-db2:CTSSGROUP_2/18d2-6ece-d4b5-18ca', numPend 2, numReady 0, numDone 1, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7f6a0c08e580, sendp (nil) status 13flags 0x200b8602, flags-2 0x10, usrFlags 0x20020 }, auth state: gipcmodTlsAuthStateReady (3)
2024-09-25 20:56:11.825 : GIPCTLS:1094362880: gipcmodTlsAuthReady: TLS Auth completed Successfully
2024-09-25 20:56:11.827 : CRSCCL:1094362880: PNC: have conn to peer.
2024-09-25 20:56:11.827 : CRSCCL:1094362880: PNC: Disconnecting conn from node (2,4294715690).
2024-09-25 20:56:11.827 : CRSCCL:1094362880: clsCclGipcDisconnect: called with pEndp->endp=2852
2024-09-25 20:56:11.827 : CRSCCL:1094362880: clsCclGipcDisconnect: calling gipcDisconnect with pEndp->endp= 0xb24
2024-09-25 20:56:11.827 : GIPCTLS:1094362880: gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7f6a3003db90 [0000000000000b24] { gipcEndpoint : localAddr 'gipcha://xxxx-xxx-oracle-db1:CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', remoteAddr 'gipcha://xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', numPend 1, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 8220, readyRef (nil), ready 0, wobj 0x7f6a3004b770, sendp (nil) status 0flags 0x20138606, flags-2 0x50, usrFlags 0x20 }
2024-09-25 20:56:11.827 :GIPCGMOD:1094362880: gipcmodGipcDisconnect: [gipc] Issued endpoint close for endp 0x7f6a3003db90 [0000000000000b24] { gipcEndpoint : localAddr 'gipcha://xxxx-xxx-oracle-db1:CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', remoteAddr 'gipcha://xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', numPend 1, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 8220, readyRef (nil), ready 0, wobj 0x7f6a3004b770, sendp (nil) status 0flags 0x20138606, flags-2 0x50, usrFlags 0x20 }
2024-09-25 20:56:11.827 : CRSCCL:1094362880: clsCclGipcDisconnect: Disconnection successful, going to destroy pEndp->endp=2852
2024-09-25 20:56:11.827 : GIPCTLS:1094362880: gipcmodTlsDisconnect: [tls] disconnect issued on endp 0x7f6a3003db90 [0000000000000b24] { gipcEndpoint : localAddr 'gipcha://xxxx-xxx-oracle-db1:CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', remoteAddr 'gipcha://xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', numPend 1, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 8220, readyRef (nil), ready 0, wobj 0x7f6a3004b770, sendp (nil) status 0flags 0x2613860e, flags-2 0x50, usrFlags 0x20 }
2024-09-25 20:56:11.827 : CRSCCL:1094362880: PNC: Keeping our connection to node (2,4294715690).
2024-09-25 20:56:11.828 :GIPCHAUP:1120085760: gipchaUpperDisconnect: initiated discconnect umsg 0x7f6a30041710 { msg 0x7f6a300435d0, ret gipcretRequestPending (15), flags 0x2 }, msg 0x7f6a300435d0 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00000b0f, dstCid 00000000-02d641be } dataLen 0, endp 0x7f6a3003ddf0 [0000000000000b0f] { gipchaEndpoint : port 'CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', peer 'xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', srcCid 00000000-00000b0f, dstCid 00000000-02d641be, numSend 0, maxSend 100, groupListType 2, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x421c }
2024-09-25 20:56:11.829 :GIPCGMOD:1120085760: gipcmodGipcCallbackEndpClosed: [gipc] Endpoint close for endp 0x7f6a3003db90 [0000000000000b24] { gipcEndpoint : localAddr '(dying)', remoteAddr '(dying)', numPend 0, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x2, pidPeer 8220, readyRef (nil), ready 0, wobj 0x7f6a3004b770, sendp (nil) status 0flags 0x2693860e, flags-2 0x50, usrFlags 0x20 }
2024-09-25 20:56:11.829 :GIPCXCPT:1120085760: gipchaUpperProcessDisconnect: dropping Disconnect to unknown msg 0x7f6a30046640 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-02d641be, dstCid 00000000-00000b0f }, node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 2, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x800080c}, ret gipcretFail (1)
2024-09-25 20:56:11.829 :GIPCHAUP:1120085760: gipchaUpperProcessDisconnect: EXCEPTION[ ret gipcretFail (1) ] error during DISCONNECT processing for node 0x7f6a3003d520 { host 'xxxx-xxx-oracle-db2', haName 'xxxx-xxx-oracle-db2-0000201c-6-00000000fffc239e', srcLuid 7668191f-9d3e2ec4, dstLuid d03089ee-a82196f5 numInf 1, sentRegister 1, localMonitor 0, baseStream 0x7f6a3003cee0 type gipchaNodeType12001 (20), nodeIncarnation fb8751cc-fffbde34, incarnation 2, cssIncarnation 0, negDigest 4294967295, roundTripTime 4294967295 lastSeenPingAck 0 nextPingId 1 latencySrc 0 latencyDst 0 flags 0x800080c}
2024-09-25 20:56:11.830 : CRSCCL:1094362880: ConnAccepted from Peer:msgTag= 0xcccccccc version= 0 msgType= 4 msgId= 0 msglen = 0 clschdr.size_clscmsgh= 88 src= (2, 4294715690) dest= (1, 4294716110)
2024-09-25 20:56:11.830 : CRSCCL:1094362880: Dump Peer Conn (2, 4294715690): state=5 endp=2416 nextMsgId=1 lastAckdId=0, conRetryCount=1,
2024-09-25 20:56:11.830 :GIPCHAUP:1120085760: gipchaUpperCallbackDisconnect: completed DISCONNECT ret gipcretSuccess (0), umsg 0x7f6a30041710 { msg 0x7f6a300435d0, ret gipcretSuccess (0), flags 0x2 }, msg 0x7f6a300435d0 { type gipchaMsgTypeDisconnect (5), srcCid 00000000-00000b0f, dstCid 00000000-02d641be } dataLen 0, hendp 0x7f6a3003ddf0 [0000000000000b0f] { gipchaEndpoint : port 'CTSSGROUP_1/2ab4-e0d0-e4e7-0b45', peer 'xxxx-xxx-oracle-db2:6-0000-0000-0000-0004', srcCid 00000000-00000b0f, dstCid 00000000-02d641be, numSend 0, maxSend 100, groupListType 2, hagroup 0x55acdcea1860, priority 0, forceAckCount 0, usrFlags 0x4000, flags 0x421c }
2024-09-25 20:56:11.831 : CTSS:1085957888: ctssslave_swm2_1: Waiting for time sync message from master. sync_state[2].
2024-09-25 20:56:11.832 : CTSS:1092261632: ctsscomm_recv_cb4_2: Receive active version change msg. Old active version [318767104] New active version [318767104].
2024-09-25 20:56:11.833 : CTSS:1092261632: ctssslave_msg_handler4_1: Waiting for slave_sync_with_master to finish sync process. sync_state[3].
2024-09-25 20:56:11.833 : CTSS:1085957888: ctssslave_swm2_3: Received time sync message from master.
2024-09-25 20:56:11.833 : CTSS:1085957888: ctssslave_swm: The magnitude [659421214] of the offset [-659421214 usec] is larger than [600000000 usec] sec which is the CTSS limit. Hence CTSS is exiting.
2024-09-25 20:56:11.833 : CTSS:1085957888: (:ctsselect_mstm4:): Failed in clsctssslave_sync_with_master [12]: Time offset is too much to be corrected
2024-09-25 20:56:11.833 : CTSS:1092261632: ctssslave_msg_handler4_3: slave_sync_with_master finished sync process. Exiting clsctssslave_msg_handler
2024-09-25 20:56:12.114 : CTSS:1124288256: ctss_checkcb: clsdm requested check alive. checkcb_data{mode[0xd0], offset[-659421 ms]}, length=[8].
2024-09-25 20:56:12.114 : CTSS:1085957888: ctsselect_monitor_steysync_mode: CTSS daemon exiting [12].
2024-09-25 20:56:12.114 : CTSS:1085957888: CTSS daemon aborting
Trace file /u01/app/grid/diag/crs/xxxx-xxx-oracle-db1/crs/trace/octssd.trc
Oracle Database 19c Clusterware Release 19.0.0.0.0 - Production
Version 19.18.0.0.0 Copyright 1996, 2023 Oracle. All rights reserved.
default:2152102528: 1: clskec:has:CLSU:910 4 args[clsdAdr_CLSK_err][mod=clsdadr.c][loc=(:CLSD00302:)][msg=2024-09-25 21:41:34.504 (:CLSD00302:) Trace file size and number of segments fetched from environment variable: ORA_DAEMON_TRACE_FILE_OPTIONS filesize=26214400,numsegments=10 Detected in function clsdAdrGetEnvVar_TFO at line number 6826. ]
CLSB:2152102528: [ INFO] Argument count (argc) for this daemon is 2
CLSB:2152102528: [ INFO] Argument 0 is: /u01/app/19.3.0/grid/bin/octssd.bin
CLSB:2152102528: [ INFO] Argument 1 is: reboot
2024-09-25 21:41:34.510 : CTSS:2152102528: tb_ctss_main invoked: 4294713950
2024-09-25 21:41:34.510 : CTSS:2152102528: Oracle Database CTSS Release 19.0.0.0.0 Production Copyright 2006, 2017 Oracle. All rights reserved.
......
2024-09-25 21:41:34.528 : CLSDMT:1990416128: [ INFO] PID for the Process [8299], connkey CTSSD
我重启 CRS 的时候,是晚上 21:01左右,但后面突然时间变成了 20:56 分,后面的时间比前面的时间还早,可以判断是系统时间出现了异常。
应该是服务器的硬件时钟发生了异常:
如日志里提示:CTSS daemon exiting: The magnitude of the offset is too much to be corrected:
这条日志显示 CTSS 守护进程检测到时间偏差过大(超过 600,000 毫秒,即 600 秒),导致 CTSS 无法继续同步并退出。这通常是由于集群节点之间的时间严重不同步引起的,可能由未运行时间同步服务或 CTSS 自身无法正常同步引起。
一个原因可能是由于硬件时钟异常导致服务器和存储的时间出现差异,导致数据库进程无法正常启动。
于是我执行了 hwclock -w 同步下硬件时钟,为防止另一个节点也出现这情况,我也在另一个节点同步了硬件时钟。
然后重新重启了节点一服务器,这次重启后,等待了没多久,就看到 grid 和 oracle 的 alert 日志已经正常启动了。
然后使用 /u01/app/19.3.0/grid/bin/crsctl check crs 查看到 CRS 服务已经正常启动了。
然后 /dev/mapper下也正常看到了新加的存储盘信息,在经过一些操作后也将其正常加入到了 ASM 磁盘组里。
为了防止这次因为硬件时钟导致 CRS 无法正常启动,我将 hwclock -w 也加入到了时钟同步的定时任务里。