刚到公司同事就找我救急,说某个生产库无法访问了。查看了一下alert log,发现有以下错误信息:
Unable to allocate flashback log of 1946 blocks from
current recovery area of size 128849018880 bytes.
看来是FLASHBACK AREA空间不足导致数据库挂起的,不管怎样先将数据库恢复服务:
nolog> shutdown abort
ORACLE instance shut down.
nolog> startup mount
ORACLE instance started.
Total System Global Area 3.4360E+10 bytes
Fixed Size 2159376 bytes
Variable Size 7214206192 bytes
Database Buffers 2.7129E+10 bytes
Redo Buffers 14614528 bytes
Database mounted.
nolog> alter system set db_recovery_file_dest_size=200G scope=spfile;
System altered.
nolog> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
nolog> startup;
ORACLE instance started.
Total System Global Area 3.4360E+10 bytes
Fixed Size 2159376 bytes
Variable Size 7214206192 bytes
Database Buffers 2.7129E+10 bytes
Redo Buffers 14614528 bytes
Database mounted.
Database opened.
接下来查找原因,FLASHBACK AREA空间怎么会不足呢?
SQL> select * from V$FLASH_RECOVERY_AREA_USAGE;
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE 0 0 0
ONLINELOG 0 0 0
ARCHIVELOG 0 0 0
BACKUPPIECE 0 0 0
IMAGECOPY 0 0 0
FLASHBACKLOG 60.14 0 8098
可以看到是FLASHBACKLOG占用了大量的空间
08:22:26 oracle[SQLPLUS]@cdbcg101[wwddkfs1]> show parameter flash
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target integer 1440
但是FALSHBACK RETENTION只设置为1440分钟,而最早的FLASHBACKUPLOG居然是好几个月之前的:
$ ls -ltr
total 127166476
-rw-r----- 1 oracle dba 15949824 Jul 9 22:00 o1_mf_7yw9ccjr_.flb
-rw-r----- 1 oracle dba 15949824 Jul 9 23:16 o1_mf_7y4jgnr7_.flb
-rw-r----- 1 oracle dba 15949824 Jul 10 04:01 o1_mf_7y5g3rvo_.flb
-rw-r----- 1 oracle dba 15949824 Jul 10 10:23 o1_mf_7z5qq9jo_.flb
-rw-r----- 1 oracle dba 15949824 Jul 10 10:26 o1_mf_7y6b72hn_.flb
-rw-r----- 1 oracle dba 15949824 Jul 10 15:00 o1_mf_7y34gjrs_.flb
-rw-r----- 1 oracle dba 15949824 Jul 10 21:00 o1_mf_7z3ch1nj_.flb
。。。
也就是说db_flashback_retention_target根本就没起作用。依稀记得很早之前生成过一个GUARANTEE RESTORE POINT,
SQL> select * from v$restore_point;
SCN DATABASE_INCARNATION# GUA STORAGE_SIZE TIME
---------- --------------------- --- ------------ ---------------------------------------------------------------------------
NAME
--------------------------------------------------------------------------------------------------------------------------------
6.0245E+12 1 YES 1.2998E+11 09-JUL-12 03.52.02.000000000 PM
INIT_STATUS_POINT
果然这个RESTORE POINT还存着,一直没删除,那FLASHBACK空间不足很可能就是由它导致的,和相关人员确认后,该RESTORE POINT早就已经不需要了,但是一直遗漏没删除。
SQL> drop restore point INIT_STATUS_POINT;
Restore point dropped.
SQL> select * from V$FLASH_RECOVERY_AREA_USAGE;
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
------------ ------------------ ------------------------- ---------------
CONTROLFILE 0 0 0
ONLINELOG 0 0 0
ARCHIVELOG 0 0 0
BACKUPPIECE 0 0 0
IMAGECOPY 0 0 0
FLASHBACKLOG .93 0 113
删除GUARANTEE RESTORE POINT后,FLASHBACK的空间也立即就降下来了。至此问题解决。
日常工作还是需要细致,一些看上去小的遗留问题日后很可能就是会造成严重后果的隐患。还好今天碰到的库是非核心业务的,还没造成大影响,不幸中的万幸。