最近单位的ORACLE数据库运行不稳定,通过查询oracle的alert日志,发现如下报错内容:

Thread 1 cannot allocate new log, sequence 108922
Checkpoint not complete
  Current log# 1 seq# 108921 mem# 0: +DATADG/apts/onlinelog/group_1.262.788509509
  Current log# 1 seq# 108921 mem# 1: +ARCHDG/apts/onlinelog/group_1.257.788509511
Thread 1 advanced to log sequence 108922 (LGWR switch)
  Current log# 2 seq# 108922 mem# 0: +DATADG/apts/onlinelog/group_2.261.788509511
  Current log# 2 seq# 108922 mem# 1: +ARCHDG/apts/onlinelog/group_2.258.788509511
Sat Apr 27 10:25:38 2013
Thread 1 cannot allocate new log, sequence 108923
Checkpoint not complete
  Current log# 2 seq# 108922 mem# 0: +DATADG/apts/onlinelog/group_2.261.788509511
  Current log# 2 seq# 108922 mem# 1: +ARCHDG/apts/onlinelog/group_2.258.788509511
Thread 1 advanced to log sequence 108923 (LGWR switch)
  Current log# 1 seq# 108923 mem# 0: +DATADG/apts/onlinelog/group_1.262.788509509
  Current log# 1 seq# 108923 mem# 1: +ARCHDG/apts/onlinelog/group_1.257.788509511

 

初步判定是ORACLE联机重做日志太小,白天生产库太忙,日志切换频繁造成,由于安装ORACLE时,REDO是按默认设置安装(默认设置是每实例2组联机日志,每组两个成员,每个日志文件50M),未对联机重做日志进行修改,后期由于业务量大幅增加,原有的设置以无法满足现有需要,需对联机日志文件进行扩容。

 

思路如下:

1、联机重做日志的三种状态

(1)、ACTIVE:活动状态

(2)、INACTIVE:未被使用状态

(3)、CURRENT:当前状态(使用状态)

2、计划对每实例2组联机日志扩充为每实例4组联机日志

3、计划对每个日志文件由50M扩充为200M

 

注意事项:

1、删除联机日志时,必须保证每个实例有两个日志文件

2、增加联机日志时,脚本中要比单实例增加thread参数来标识某个实例

3、主要使用的动态性能视图有v$log和v$logfile

4、联机日志在删除时必须是INACTIVE状态

5、需在夜晚业务量下降时实施

 

实施步骤如下:

1、用tail -f alert*.ora命令进一步证实了上述判断,发现联机重做日志切换时间为3分钟左右,日志再次切换时,日志文件状态为ACTIVE,并非是INACTIVE(未使用状态)

2、查询日志组状态

select * from v$log order by thread#,group#;

查询日志各成员信息

select * from v$logfile;

 

3、必要文件备份

(1)、备份初始化参数文件

create pfile='/home/oracle/bak_pfile_20130427_add_redolog.ora' from spfile;

(2)、备份控制文件

alter database backup controlfile to /home/oracle/bak_controlfile_20130427_add_redolog.ora';

4、增加联机日志文件

alter database add logfile thread 1 group 5('+DATADG/apts/onlinelog/group5_1','+ARCHDG/apts/onlinelog/group5_2') size 200M;
alter database add logfile thread 2 group 6('+DATADG/apts/onlinelog/group6_1','+ARCHDG/apts/onlinelog/group6_2') size 200M;

alter database add logfile thread 1 group 7('+DATADG/apts/onlinelog/group7_1','+ARCHDG/apts/onlinelog/group7_2') size 200M;
alter database add logfile thread 2 group 8('+DATADG/apts/onlinelog/group8_1','+ARCHDG/apts/onlinelog/group8_2') size 200M;

 

5、查看增加的文件信息

select * from v$log order by thread#,group#;

查询日志各成员信息

select * from v$logfile;

 

6、将联机日志文件切换到新增加的文件上使用,使1、2、3、4组联机日志文件状态变为INACTIVE

alter system switch logfile;

alter system checkpoint;

 

7、删除1、2、3、4组联机日志文件

alter database drop logfile group 1;

alter database drop logfile group 2;

alter database drop logfile group 3;

alter database drop logfile group 4;

这里发现一个问题,如果联机日志文件时自己创建的,有可能在使用删除命令后,在磁盘阵列中的文件并没有被删除,从而导致空间浪费及重建同名组是报错,如图:

 

这时需要用ORACLE自带的ASM维护命令asmcmd进入到磁盘阵列中去删除响应的文件即可,如图:

 

8、重建联机日志1,2,3,4组并分配200M的空间

alter database add logfile thread 1 group 1('+DATADG/apts/onlinelog/group1_1','+ARCHDG/apts/onlinelog/group1_2') size 200M;
alter database add logfile thread 2 group 2('+DATADG/apts/onlinelog/group2_1','+ARCHDG/apts/onlinelog/group2_2') size 200M;

alter database add logfile thread 1 group 3('+DATADG/apts/onlinelog/group3_1','+ARCHDG/apts/onlinelog/group3_2') size 200M;
alter database add logfile thread 2 group 4('+DATADG/apts/onlinelog/group4_1','+ARCHDG/apts/onlinelog/group4_2') size 200M;
 

9、使用命令切换日志文件,看是否正常

alter system switch logfile;

select * from v$log order by thread#,group#;

select * from v$logfile;

alter system checkpoint;

如图:

 

 

 

 

操作完毕。