Oracle归档日志所在目录时间不对&&Oracle集群日志时间显示错误
Oracle归档日志所在目录时间不对&&Oracle集群日志时间显示错误
前言
这个问题在18年的时候遇到了,基本不注意并且集群或者数据库运行正常是很难注意到的。
忘记当时怎么发现的了,最近拿出来重现一下问题。
具体描述一下问题:
比如当前时间是2020年07月03日0点,数据库对redo log发起归档。
在OMF下,归档会按照日期进行分类归档。
但是发起归档之后,新的归档文件会放在的2020_07_02的目录中。
另外,db和asm和集群的警告日志的时间也和操作系统的date时间不一致。
现象
版本:11.2.0.4.0
在我的测试环境中,关闭所有节点的集群之后,更改时间:
[root@rac1 ~]# date -s "2020-07-03 00:00:00"Fri Jul3 00:00:00 CST 2020
然后,启动所有节点的集群,可以依次观察
集群的警告日志,grid用户下,$ORACLE_HOME/log/rac1/alertrac1.log
2020-07-02 09:00:26.667:
[ohasd(24600)]CRS-2112:The OLR service started on node rac1.2020-07-02 09:00:26.675:
[ohasd(24600)]CRS-1301:Oracle High Availability Service started on node rac1.2020-07-02 09:00:26.676:
[ohasd(24600)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0errors occurred2020-07-02 09:00:30.080:
[/u01/app/11.2.0/grid/bin/orarootagent.bin(24643)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).2020-07-02 09:00:34.563:
[gpnpd(24737)]CRS-2328:GPNPD started on node rac1.2020-07-02 09:00:36.992:
[cssd(24806)]CRS-1713:CSSD daemon is started inclustered mode2020-07-02 09:00:38.798:
[ohasd(24600)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon'as its target state is OFFLINE2020-07-02 09:01:01.319:
[cssd(24806)]CRS-1707:Lease acquisition for node rac1 number 1completed2020-07-02 09:01:02.612:
[cssd(24806)]CRS-1605:CSSD voting file is online: /dev/mapper/asm-ocr02; details in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log.2020-07-02 09:01:02.620:
[cssd(24806)]CRS-1605:CSSD voting file is online: /dev/mapper/asm-ocr03; details in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log.2020-07-02 09:01:02.623:
[cssd(24806)]CRS-1605:CSSD voting file is online: /dev/mapper/asm-ocr01; details in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log.2020-07-02 09:01:15.917:
[cssd(24806)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .2020-07-02 09:01:18.061:
[ctssd(25418)]CRS-2403:The Cluster Time Synchronization Service on host rac1 is inobserver mode.2020-07-02 09:01:18.421:
[ctssd(25418)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac1.2020-07-02 09:01:18.422:
[ctssd(25418)]CRS-2401:The Cluster Time Synchronization Service started on host rac1.2020-07-02 09:01:20.023:
[ohasd(24600)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon'as its target state is OFFLINE
[client(25475)]CRS-10001:02-Jul-20 09:01 ACFS-9391: Checking for existing ADVM/ACFS installation.
[client(25480)]CRS-10001:02-Jul-20 09:01 ACFS-9392: Validating ADVM/ACFS installation files foroperating system.
[client(25482)]CRS-10001:02-Jul-20 09:01 ACFS-9393: Verifying ASM Administrator setup.
[client(25485)]CRS-10001:02-Jul-20 09:01 ACFS-9308: Loading installed ADVM/ACFS drivers.
[client(25488)]CRS-10001:02-Jul-20 09:01 ACFS-9154: Loading 'oracleoks.ko'driver.
[client(25521)]CRS-10001:02-Jul-20 09:01 ACFS-9154: Loading 'oracleadvm.ko'driver.
[client(25548)]CRS-10001:02-Jul-20 09:01 ACFS-9154: Loading 'oracleacfs.ko' driver.
........
ASM实例警告日志,
Thu Jul 02 09:01:26 2020NOTE: No asm libraries foundinthe system
NOTE: No asm libraries foundinthe system
MEMORY_TARGET defaulting to1128267776.* instance_number obtained from CSS = 1, checking for the existence of node 0...* node 0 does not exist. instance_number = 1Starting ORACLE instance (normal)
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the/dev/shm file system to be mounted for at least 1140850688 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 976789504 and used is 73728 bytes. Ensure that the mount point is /dev/shm forthis directory.
LICENSE_MAX_SESSION= 0LICENSE_SESSIONS_WARNING= 0Initial number of CPU is4Number of processor coresin the system is 4Number of processor socketsin the system is 2Private Interface'eth1:1' configured from GPnP foruse as a private interconnect.
[name='eth1:1', type=1, ip=169.254.146.181, mac=00-0c-29-60-50-8b, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface'eth0' configured from GPnP foruse as a public interface.
[name='eth0', type=1, ip=192.168.180.100, mac=00-0c-29-60-50-81, net=192.168.180.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 3Using LOG_ARCHIVE_DEST_1 parameter default value as/u01/app/11.2.0/grid/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS= 0SYS auditing is disabled
NOTE: Volume support enabled
Starting up:
Oracle Database 11g Enterprise Edition Release11.2.0.4.0 -64bit Production
With the Real Application Clusters and Automatic Storage Management options.
ORACLE_HOME= /u01/app/11.2.0/grid
System name: Linux
Node name: rac1.example.com
Release:2.6.32-573.el6.x86_64
Version: #1 SMP Wed Jul 1 18:23:37 EDT 2015Machine: x86_64
VM name: VMWare Version:6Using parameter settingsin server-side spfile +OCR/rac-cluster/asmparameterfile/registry.253.1034884609
DB的警告日志,
Thu Jul 02 09:02:47 2020Adjusting the default value of parameter parallel_max_servers
from160 to 120 due to the value of parameter processes (150)
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION= 0LICENSE_SESSIONS_WARNING= 0Initial number of CPU is4Number of processor coresin the system is 4Number of processor socketsin the system is 2Private Interface'eth1:1' configured from GPnP foruse as a private interconnect.
[name='eth1:1', type=1, ip=169.254.146.181, mac=00-0c-29-60-50-8b, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface'eth0' configured from GPnP foruse as a public interface.
[name='eth0', type=1, ip=192.168.180.100, mac=00-0c-29-60-50-81, net=192.168.180.0/24, mask=255.255.255.0, use=public/1]
Public Interface'eth0:2' configured from GPnP foruse as a public interface.
[name='eth0:2', type=1, ip=192.168.180.111, mac=00-0c-29-60-50-81, net=192.168.180.0/24, mask=255.255.255.0, use=public/1]
Picked latch-free SCN scheme 3Autotune of undo retention is turned on.
LICENSE_MAX_USERS= 0SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release11.2.0.4.0 -64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options.
ORACLE_HOME= /u01/app/oracle/product/11.2.0/db_1
System name: Linux
Node name: rac1.example.com
Release:2.6.32-573.el6.x86_64
可以看到,日志开头全都是2020年07月02日的9点初。
我们再看看归档的情况,
在此之前,先删除之前的所有归档便于观察。
[oracle@rac2 ~]$ rman target /Recovery Manager: Release11.2.0.4.0 - Production on Fri Jul 3 00:20:01 2020Copyright (c)1982, 2011, Oracle and/or its affiliates. Allrights reserved.
connectedto target database: ZKM (DBID=4210149549)
RMAN> delete force noprompt archivelog all;
using targetdatabase control file instead ofrecovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=200 instance=zkm2 device type=DISKListof Archived Log Copies for database withdb_unique_name ZKM=====================================================================
KeyThrd Seq S Low Time------- ---- ------- - ---------
91 1 56 A 02-JUL-20Name:+ARCH/zkm/archivelog/2020_07_03/thread_1_seq_56.303.1044695873
92 2 21 A 02-JUL-20Name:+ARCH/zkm/archivelog/2020_07_02/thread_2_seq_21.299.1044695875deleted archivedlogarchivedlog file name=+ARCH/zkm/archivelog/2020_07_03/thread_1_seq_56.303.1044695873 RECID=91 STAMP=1044749873deleted archivedlogarchivedlog file name=+ARCH/zkm/archivelog/2020_07_02/thread_2_seq_21.299.1044695875 RECID=92 STAMP=1044695874Deleted2 objects
其中一个节点发出命令归档redo log,
SYS@zkm1> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';
Session altered.
SYS@zkm1>SYS@zkm1> select sysdate fromdual;
SYSDATE------------------------
2020-07-03 00:20:27SYS@zkm1> alter system archive log current;
System altered.
观察警告日志,显示的时间是“Thu Jul 02 09:20:32 2020”。
Thu Jul 02 09:20:32 2020Thread1 advanced to log sequence 58(LGWR switch)Current log# 2 seq# 58 mem# 0: +DATA/zkm/onlinelog/group_2.258.1034889845
Current log# 2 seq# 58 mem# 1: +ARCH/zkm/onlinelog/group_2.258.1034889845Thu Jul02 09:20:32 2020ArchivedLog entry 93 added for thread 1 sequence 57 ID 0xfaf1eead dest 1:
进去ASM磁盘查看文件情况,
ASMCMD> ls -l
Type Redund Striped Time Sys Name
Y 2020_07_02/ASMCMD> ls -l *Type Redund Striped Time Sys Name
ARCHIVELOG UNPROT COARSE JUL02 09:00:00 Y thread_1_seq_57.299.1044696033ARCHIVELOG UNPROT COARSE JUL02 09:00:00 Y thread_2_seq_22.303.1044696033ASMCMD>pwd+arch/zkm/ARCHIVELOG
至此,问题完全重现成功。
原因
不知道各位发现没有,实际上最开始启动集群(crsctl start crs)那一瞬间(大概是2020-07-03 00:00:00多,刚改完时间不到1分钟),
集群的警告日志最开始出现的时间为“2020-07-02 09:00:26.667:”,基本相差15小时。
因此很容易也可以猜出,是由于时区的原因。
系统的时区为:
[grid@rac1 ~]$ cat /etc/sysconfig/clock
# Thetime zone of the system is defined by the contents of /etc/localtime.
# Thisfile is only for evaluation by system-config-date, donot rely on its
# contents elsewhere.
ZONE="Asia/Shanghai"
那么,集群应该也有自己的配置文件对应的时区,也就是集群不依赖OS的时区(-_-||)。
mos可以找到这么一篇文章:How To Change Timezone for Grid Infrastructure (Doc ID 1209444.1)
因此可以发现Grid Infrastructure的时区配置文件位置在$GRID_HOME/crs/install/s_crsconfig__env.txt。
所以,现象的模拟其实,就是我把该文件的时区改成如下了:
[grid@rac1 ~]$ cd $ORACLE_HOME/crs/install[grid@rac1install]$ lscmdllroot.sh crsconfig_lib.pm crsdelete.pm inittab installRemove.excl oracle-ohasd.conf paramfile.crs rootcrs.pl s_crsconfig_defs s_crsconfig_rac1_env.txt.bak
crsconfig_addparams.sbs crsconfig_params crspatch.pminstall.excl onsconfig oracle-ohasd.service ParentDirPerm_rac1.txt roothas.pl s_crsconfig_lib.pm tfa_setup.shcrsconfig_fileperms.excl crsconfig_params.sbs hasdconfig.plinstall.incl oraacfs.pm oracss.pm preupdate.sh rootofs.shs_crsconfig_rac1_env.txt
[grid@rac1install]$ cats_crsconfig_rac1_env.txt
### Thisfile can be used to modify the NLS_LANG environment variable, which determines the charset to be used formessages.
### For example, a new charset can be configured by setting NLS_LANG=JAPANESE_JAPAN.UTF8
### Do not modify thisfileexcept to change NLS_LANG, or under the direction of Oracle Support Services
TZ=US/Pacific
NLS_LANG=AMERICAN_AMERICA.AL32UTF8
TNS_ADMIN=ORACLE_BASE=
网上找找看是不是刚好和上海时间相差15个小时。
https://time.is/PT
得证!!!