解决实例stuck Archiver
其实,这个问题非常好解决,其原因就是因为,数据库开启了归档且归档日志目录满导致的。
还有一种原因,通常,我们使用fast recovery area作为归档日志的存放目录,当前的物理空间还很大,而db_recovery_file_dest_size参数却设置的小了,也会导致这个现像的产生。
下面,我们来看一下这个现象。
我当前资源的状态:
[grid@djp01 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DBFILE1.dg
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.FRA1.dg
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.LISTENER.lsnr
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.OCRVDISK.dg
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.asm
ONLINE ONLINE djp01 Started
ONLINE ONLINE djp02 Started
ora.gsd
OFFLINE OFFLINE djp01
OFFLINE OFFLINE djp02
ora.net1.network
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.ons
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.registry.acfs
ONLINE ONLINE djp01
ONLINE ONLINE djp02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE djp01
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE djp02
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE djp02
ora.cvu
1 ONLINE ONLINE djp02
ora.djp01.vip
1 ONLINE ONLINE djp01
ora.djp02.vip
1 ONLINE ONLINE djp02
ora.oc4j
1 ONLINE ONLINE djp02
ora.orarac.db
1 ONLINE ONLINE djp01 Open
2 ONLINE ONLINE djp02 Open
ora.scan1.vip
1 ONLINE ONLINE djp01
ora.scan2.vip
1 ONLINE ONLINE djp02
ora.scan3.vip
1 ONLINE ONLINE djp02
[grid@djp01 ~]$
数据库的日志模式:
SQL> select log_mode from v$database;
LOG_MODE
------------
NOARCHIVELOG
SQL>
下面,我们来开户归档。
SQL> ! srvctl stop database -d orarac
SQL> ! srvctl start database -d orarac -o mount
SQL> conn / as sysdba
Connected.
SQL> alter database archivelog
2 /
Database altered.
SQL> alter database open
2 /
Database altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
[oracle@djp01 ~]$ ssh djp02
Last login: Sat Jan 24 05:59:13 2015 from djp01
[oracle@djp02 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sat Jan 24 06:12:40 2015
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> alter database open
2 /
Database altered.
SQL> select inst_id,log_mode from gv$database
2 /
INST_ID LOG_MODE
---------- ------------
2 ARCHIVELOG
1 ARCHIVELOG
SQL>
归档启用完成。
下面,我们来设置一下db_recovery_file_dest_size参数的大小,然后手动进行一下日志的归档。
SQL> alter system set db_recovery_file_dest_size=6M sid='*' scope=both
2 /
System altered.
SQL>
我把fast recovery area的大小设置为6M。下面,我们来进行日志的归档。
SQL> alter system archive log current;
alter system archive log current
*
ERROR at line 1:
ORA-16038: log 3 sequence# 17 cannot be archived
ORA-19809: limit exceeded for recovery files
ORA-00312: online log 3 thread 2:
'+DBFILE1/orarac/onlinelog/group_3.265.866332825'
ORA-00312: online log 3 thread 2:
'+FRA1/orarac/onlinelog/group_3.259.866332829'
SQL> ! oerr ora 19809
19809, 00000, "limit exceeded for recovery files"
//*Cause: The limit for recovery files specified by the
// DB_RECOVERY_FILE_DEST_SIZE was exceeded.
// *Action: There are five possible solutions:
// 1) Take frequent backup of recovery area using RMAN.
// 2) Consider changing RMAN retention policy.
// 3) Consider changing RMAN archived log deletion policy.
// 4) Add disk space and increase DB_RECOVERY_FILE_DEST_SIZE.
// 5) Delete files from recovery area using RMAN.
SQL>
它直接出现了错误。
alert 日志中的一段:
************************************************************************
ARC0: Error 19809 Creating archive log file to '+FRA1'
Errors in file /u01/app/oracle/diag/rdbms/orarac/orarac2/trace/orarac2_arc1_14866.trc:
ORA-19815: WARNING: db_recovery_file_dest_size of 6291456 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
system command was used to delete files, then use RMAN CROSSCHECK and
DELETE EXPIRED commands.
************************************************************************
通过上述的显示,Oracle已经给出了具体的原因。我们只要检查上述的一些项,按照结果处理就可以了。
CRS资源的状态:
[grid@djp01 ~]$ crsctl status res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DBFILE1.dg
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.FRA1.dg
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.LISTENER.lsnr
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.OCRVDISK.dg
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.asm
ONLINE ONLINE djp01 Started
ONLINE ONLINE djp02 Started
ora.gsd
OFFLINE OFFLINE djp01
OFFLINE OFFLINE djp02
ora.net1.network
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.ons
ONLINE ONLINE djp01
ONLINE ONLINE djp02
ora.registry.acfs
ONLINE ONLINE djp01
ONLINE ONLINE djp02
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE djp01
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE djp02
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE djp02
ora.cvu
1 ONLINE ONLINE djp02
ora.djp01.vip
1 ONLINE ONLINE djp01
ora.djp02.vip
1 ONLINE ONLINE djp02
ora.oc4j
1 ONLINE ONLINE djp02
ora.orarac.db
1 ONLINE INTERMEDIATE djp01 Stuck Archiver
2 ONLINE INTERMEDIATE djp02 Stuck Archiver
ora.scan1.vip
1 ONLINE ONLINE djp01
ora.scan2.vip
1 ONLINE ONLINE djp02
ora.scan3.vip
1 ONLINE ONLINE djp02
[grid@djp01 ~]$
我们把db_recovery_file_dest_size参数设置的大一些,然后进行手动归档:
SQL> alter system set db_recovery_file_dest_size=10G sid='*' scope=both
2 /
System altered.
SQL> alter system archive log current;
System altered.
SQL>
归档成功。这里,资源的状态恢复为正常,如下:
ora.orarac.db
1 ONLINE ONLINE djp01 Open
2 ONLINE ONLINE djp02 Open
###############################################################################################
---查看Rac数据库状态
[grid@www.solgle.com-1 bin]$ ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.DATA2.dg
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.LISTENER.lsnr
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.asm
ONLINE ONLINE www.solgle.com-1 Started
ONLINE ONLINE www.solgle.com-2 Started
ora.gsd
OFFLINE OFFLINE www.solgle.com-1
OFFLINE OFFLINE www.solgle.com-2
ora.net1.network
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.ons
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE www.solgle.com-1
ora.cvu
1 ONLINE ONLINE www.solgle.com-1
ora.solgle.db
1 ONLINE INTERMEDIATE www.solgle.com-1 Stuck Archiver
2 ONLINE ONLINE www.solgle.com-2 Open
ora.oc4j
1 ONLINE ONLINE www.solgle.com-1
ora.www.solgle.com-1.vip
1 ONLINE ONLINE www.solgle.com-1
ora.rac2.vip
1 ONLINE ONLINE www.solgle.com-2
ora.scan1.vip
1 ONLINE ONLINE www.solgle.com-1
[grid@www.solgle.com-1 bin]$
--www.solgle.com-1归档出错,下面进行处理
---查看空间使用情况
C:\Users\user>sqlplus sys/********@solgle_db as sysdba;
SQL*Plus: Release 11.2.0.1.0 Production on Thu Oct 23 09:04:30 2014
Copyright (c) 1982, 2010, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Managemen
OLAP,Data Mining and Real Application Testing options
SQL> set linesize 100
SQL> show parameter db_recovery
NAME TYPE VALUE
------------------------------------ -------------------------------- ----------
--------------------
db_recovery_file_dest string +data2
db_recovery_file_dest_size big integer 40G
SQL>
SQL> select sum(percent_space_used) from v$flash_recovery_area_usage;
SUM(PERCENT_SPACE_USED)
-----------------------
99.98
SQL>
---一般遇到这种情况,要么扩大空间,要么删除文件
----查看asm磁盘组磁盘空间使用情况
[grid@www.solgle.com-1 bin]$ ./asmcmd
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 2097152 137708 54984 0 54984 0 Y DATA1/
MOUNTED EXTERN N 512 4096 1048576 41239 20 0 20 0 N DATA2/
ASMCMD> exit
[grid@www.solgle.com-1 bin]$
---如果是windows server,可以cmd->asmca来查看空间使用情况
---free_MB 是磁盘剩余空间,20M已经很少了,当然可以添加磁盘,但这里选择删除文件来释放空间
C:\Users\user>rman target=sys/********@solgle_db nocatalog
恢复管理器: Release 11.2.0.1.0 - Production on 星期四 10月 23 09:03:24 2014
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
连接到目标数据库: solgle (DBID=1369758280)
使用目标数据库控制文件替代恢复目录
RMAN> delete noprompt obsolete;
RMAN 保留策略将应用于该命令
将 RMAN 保留策略设置为冗余 1
分配的通道: ORA_DISK_1
通道 ORA_DISK_1: SID=71 实例=solgle2 设备类型=DISK
删除以下已废弃的备份和副本:
类型 关键字 完成时间 文件名/句柄
-------------------- ------ ------------------ --------------------
存档日志 5284 09-10月-14 +DATA2/solgle/archivelog/2014_10_09/
thread_2_seq_1774.1392.860515205
存档日志 5285 09-10月-14 +DATA2/solgle/archivelog/2014_10_09/
thread_1_seq_3558.691.860529621
存档日志 5286 09-10月-14 +DATA2/solgle/archivelog/2014_10_09/
thread_1_seq_3559.514.860540419
存档日志 5287 10-10月-14 +DATA2/solgle/archivelog/2014_10_10/
thread_2_seq_1775.480.860562593
备份集 26 10-10月-14
备份片段 26 10-10月-14 +DATA2/solgle/autobackup/2014_10_10/s_8
60581333.474.860581341
备份集 27 20-10月-14
备份片段 27 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61448746.541.861448755
备份集 28 20-10月-14
备份片段 28 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61452127.1304.861452135
备份集 29 20-10月-14
备份片段 29 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61460159.1194.861460167
备份集 30 20-10月-14
备份片段 30 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61464063.1154.861464069
备份集 31 20-10月-14
备份片段 31 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61467066.1239.861467073
备份集 32 20-10月-14
备份片段 32 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61467668.515.861467677
备份集 33 20-10月-14
备份片段 33 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61468370.531.861468377
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_09/thread_2_seq_1774.1392.860515
205 RECID=5284 STAMP=860515210
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_09/thread_1_seq_3558.691.8605296
21 RECID=5285 STAMP=860529628
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_09/thread_1_seq_3559.514.8605404
19 RECID=5286 STAMP=860540426
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_10/thread_2_seq_1775.480.8605625
93 RECID=5287 STAMP=860562599
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_10/s_860581333.474.860581341 RECID
=26 STAMP=860581340
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861448746.541.861448755 RECID
=27 STAMP=861448753
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861452127.1304.861452135 RECI
D=28 STAMP=861452135
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861460159.1194.861460167 RECI
D=29 STAMP=861460166
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861464063.1154.861464069 RECI
D=30 STAMP=861464069
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861467066.1239.861467073 RECI
D=31 STAMP=861467072
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861467668.515.861467677 RECID
=32 STAMP=861467676
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861468370.531.861468377 RECID
=33 STAMP=861468376
12 对象已删除
RMAN>
---再次查看空间使用率
SQL> select sum(PERCENT_SPACE_USED) from v$flash_recovery_area_usage;
SUM(PERCENT_SPACE_USED)
-----------------------
99.91
SQL>
----99.91% ,说明删除冗余后,剩余空间还是没有释放很多
---下面继续删除归档文件
RMAN> delete noprompt archivelog all;
释放的通道: ORA_DISK_1
分配的通道: ORA_DISK_1
通道 ORA_DISK_1: SID=71 实例=solgle2 设备类型=DISK
db_unique_name 为 solgle 的数据库的归档日志副本列表
=====================================================================
5821 2 2080 A 22-10月-14
名称: +DATA2/solgle/archivelog/2014_10_22/thread_2_seq_2080.265.861654899
5826 2 2081 A 22-10月-14
名称: +DATA2/solgle/archivelog/2014_10_22/thread_2_seq_2081.872.86165741
5
5829 2 2082 A 22-10月-14
名称: +DATA2/solgle/archivelog/2014_10_22/thread_2_seq_2082.880.86166566
5
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_10/thread_1_seq_3560.860.8605836
09 RECID=5288 STAMP=860583616
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_10/thread_1_seq_3561.724.8606268
27 RECID=5291 STAMP=860626833
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_11/thread_1_seq_3562.1080.860652
157 RECID=5292 STAMP=860652164
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_11/thread_1_seq_3563.986.8606745
11 RECID=5294 STAMP=860674519
已删除的归档日志
归档日志文件名=+DATA2/solgle/archivelog/2014_10_11/thread_1_seq_3564.907.8606745
51 RECID=5295 STAMP=860674560
... ...
---rman 删除仍在继续,下面查询空间使用率
SQL> select sum(percent_space_used) from v$flash_recovery_area_usage;
SUM(PERCENT_SPACE_USED)
-----------------------
60.7
SQL>
----空间还会逐渐释放出来,下面可以开始启动实例www.solgle.com-1了
[grid@www.solgle.com-1 bin]$ ./srvctl start instance -d solgle -n www.solgle.com-1
PRCC-1015 : solgle was already running on www.solgle.com-1
PRCR-1004 : Resource ora.solgle.db is already running
----查看状态,还用不着执行启动www.solgle.com-1命令,已经自动启起来了
[grid@www.solgle.com-1 bin]$ ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.DATA2.dg
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.LISTENER.lsnr
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.asm
ONLINE ONLINE www.solgle.com-1 Started
ONLINE ONLINE www.solgle.com-2 Started
ora.gsd
OFFLINE OFFLINE www.solgle.com-1
OFFLINE OFFLINE www.solgle.com-2
ora.net1.network
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
ora.ons
ONLINE ONLINE www.solgle.com-1
ONLINE ONLINE www.solgle.com-2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE www.solgle.com-1
ora.cvu
1 ONLINE ONLINE www.solgle.com-1
ora.solgle.db
1 ONLINE ONLINE www.solgle.com-1 Open
2 ONLINE ONLINE www.solgle.com-2 Open
ora.oc4j
1 ONLINE ONLINE www.solgle.com-1
ora.www.solgle.com-1.vip
1 ONLINE ONLINE www.solgle.com-1
ora.rac2.vip
1 ONLINE ONLINE www.solgle.com-2
ora.scan1.vip
1 ONLINE ONLINE www.solgle.com-1
[grid@www.solgle.com-1 bin]$
----因为删除了所有归档,所以现在备份下
RMAN> backup as compressed backupset database;
启动 backup 于 23-10月-14
使用通道 ORA_DISK_1
使用通道 ORA_DISK_1
通道 ORA_DISK_1: 正在启动压缩的全部数据文件备份集
通道 ORA_DISK_1: 正在指定备份集内的数据文件
输入数据文件: 文件号=00003 名称=+DATA1/solgle/datafile/undotbs1.258.855682813
输入数据文件: 文件号=00022 名称=+DATA1/solgle/datafile/tbs_data_part_17.284.856869153
... ...
----再次删除冗余
RMAN> delete noprompt obsolete;
RMAN 保留策略将应用于该命令
将 RMAN 保留策略设置为冗余 1
使用通道 ORA_DISK_1
删除以下已废弃的备份和副本:
类型 关键字 完成时间 文件名/句柄
-------------------- ------ ------------------ --------------------
备份集 25 10-10月-14
备份片段 25 10-10月-14 +DATA2/solgle/backupset/2014_10_10/nnnd
f0_tag20141010t092937_0.1269.860578185
备份集 34 20-10月-14
备份片段 34 20-10月-14 +DATA2/solgle/autobackup/2014_10_20/s_8
61469169.527.861469177
已删除备份片段
备份片段句柄=+DATA2/solgle/backupset/2014_10_10/nnndf0_tag20141010t092937_0.1269
.860578185 RECID=25 STAMP=860578183
已删除备份片段
备份片段句柄=+DATA2/solgle/autobackup/2014_10_20/s_861469169.527.861469177 RECID
=34 STAMP=861469176
2 对象已删除
RMAN> exit
恢复管理器完成。
C:\Users\user>
--实例恢复成功