I recently ran into an error on a new Exadata: the database could not archive its redo logs.
The alert log showed the following errors:
2023-04-03T00:45:05.947884-04:00
ORACLE Instance cdb12 - Cannot allocate log, archival required
--ATTENTION--
Thread 2 cannot allocate new log, sequence 142. All online logs need archiving. Examine archive trace files for archiving errors.
2023-04-03T00:45:05.985236-04:00
Current log# 18 seq# 141 mem# 0: +D001/CDB1/ONLINELOG/group_18.288.1133099053
2023-04-03T00:45:09.705851-04:00
I then checked the corresponding arch trace file and found the following errors:
*** 2023-04-03T01:42:12.037724-04:00 (CDB$ROOT(1))
OSSIPC:SKGXP:[1f6eba00.13]{0}: RDSMRDIAG closing rgn 0x1f703470 cell=192.168.0.19 key 1620489823 ldiscon=59139 finvalcon=0 saved=0 (r 0) missed=1 evt=0 idle=14038 (ts=1680500532037712)
Reconnect: Client timeout for cell 192.168.0.19 (0x1f7017e0) is 90000, global timeout is 4294967295
Device Reopen async completion request 0x1f742200, device handle 0x1f75dd20, error code 0 device: recov_CD_00_nshqae01celadm06
Device Reopen async completion request 0x1f742200, device handle 0x1f76edd0, error code 0 device: recov_CD_10_nshqae01celadm06
Device Reopen async completion request 0x1f742200, device handle 0x1f770f50, error code 0 device: recov_CD_01_nshqae01celadm06
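It turned out that the cellinit.ora on the cell node and the cellinit.ora on the compute node specified different subnet masks: one used a /20 mask and the other a /24 mask.
Once the masks were made consistent, the problem was resolved.
What puzzles me is that, in the past, a mask mismatch meant the compute node could not discover the disks at all, so GI could not even be installed, and things would never have reached the point of an archiving error.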
[Mon Apr 03 01:48:08][132129][root@nshqae01adm02:/etc/oracle/cell/network-config][0]# cat cellinit.ora
ipaddress1=192.168.0.3/20
ipaddress2=192.168.0.4/20
[root@nshqae01celadm01 config]# pwd
/opt/oracle/cell23.1.0.0.0_LINUX.X64_230225.1/cellsrv/deploy/config
[root@nshqae01celadm01 config]# cat cellinit.ora
#CELL Initialization Parameters
#ipaddress2=192.168.0.10/24
#ipaddress1=192.168.0.9/24
ipaddress2=192.168.0.10/20
ipaddress1=192.168.0.9/20
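
A quick way to catch this kind of mismatch is to compare the prefix lengths in the cellinit.ora files from each node. The following is only a minimal sketch in Python, not an Oracle tool: the function names (parse_cellinit, prefix_lengths, masks_consistent) are made up for illustration, and it assumes the file contents have already been copied into strings. The "before" sample for the cell node is an assumption reconstructed from the commented-out /24 lines shown above.

```python
import ipaddress
import re

# Matches active lines such as "ipaddress1=192.168.0.3/20".
_IP_LINE = re.compile(r"^\s*(ipaddress\d+)\s*=\s*(\S+)\s*$")


def parse_cellinit(text):
    """Return {parameter_name: IPv4Interface} for active ipaddressN lines."""
    result = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comments, including commented-out addresses
        match = _IP_LINE.match(line)
        if match:
            result[match.group(1)] = ipaddress.ip_interface(match.group(2))
    return result


def prefix_lengths(text):
    """Return the set of prefix lengths used in one cellinit.ora."""
    return {iface.network.prefixlen for iface in parse_cellinit(text).values()}


def masks_consistent(*texts):
    """True if every active address in every file uses the same prefix length."""
    lengths = set()
    for text in texts:
        lengths |= prefix_lengths(text)
    return len(lengths) <= 1


# Contents of the compute node file as shown above.
COMPUTE = "ipaddress1=192.168.0.3/20\nipaddress2=192.168.0.4/20\n"

# Cell node file as shown above (after the fix).
CELL_AFTER = (
    "#CELL Initialization Parameters\n"
    "#ipaddress2=192.168.0.10/24\n"
    "#ipaddress1=192.168.0.9/24\n"
    "ipaddress2=192.168.0.10/20\n"
    "ipaddress1=192.168.0.9/20\n"
)

# Assumed state of the cell node file before the fix (hypothetical).
CELL_BEFORE = (
    "#CELL Initialization Parameters\n"
    "ipaddress2=192.168.0.10/24\n"
    "ipaddress1=192.168.0.9/24\n"
)

print("before fix consistent:", masks_consistent(COMPUTE, CELL_BEFORE))
print("after fix consistent:", masks_consistent(COMPUTE, CELL_AFTER))
```

The check only compares prefix lengths, since that is what differed here. It does not verify that the addresses are reachable or that the files are in the right location.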