引题:朋友一时兴起使用了rm**,删除了oracle数据文件后找我帮忙,我在帮朋友恢复数据库时,遇到了当recover时,报错不能找到28739号归档日志,这样我就不能同步scn,更不能打开数据库了。这是归档日志不连续的典型案例,我最后告诉他要做好心理准备。事情还没有完,这个真实案例引发了我的思考,如果当时在朋友没有做rman拯救措施的情况下,可不可能不使用rman即可恢复数据文件呢!最后我找到了答案:)
案例
1.系统solaris SunOS TJLT-YDWG6 5.9 Generic_122300-25 sun4u sparc SUNW,Sun-Fire-V890
DB OracleDatabase 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
2.案情描述
现场工程师使用了rm -rf *.dbf命令把所有的数据文件全部删除了
现在有5月4日的备份
restore database until time "to_date('2012-05-04 12:00:00','yyyy-mm-dd hh24:mi:ss')"; 进行恢复显示finish restore complete没有问题已经把文件 复制回来了
进行同步
RMAN> recover database until time "to_date('2012-05-04 11:00:00','yyyy-mm-dd hh24:mi:ss')";
Starting recover at 2012-07-26 14:02:42
using channel ORA_DISK_1
starting media recovery
unable to find archive log
archive log thread=1 sequence=28739 缺少28739号的归档日志,导致undotbs01.dbf文件不一致
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 2 needs more recovery to be consistent
ORA-01110: data file 2: '/opt/oradata/kpidb/undotbs01.dbf'
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 07/26/2012 14:02:51
RMAN-06054: media recovery requesting unknown log: thread 1 seq 28739 lowscn 1513525474
Leonarding
2012.7.26
在我们工作中可能会经常发生这样类似的突发状况,在遇到此情况下首先要做的就是冷静,上面发生的问题到了我这里之后,我就发现数据库已经变成了mount状态,在使用文件句柄方式恢复数据文件已经为时已晚,所以我采用了常规的恢复方式,没想到啊没想到,归档日志还不全,立马我整个人都“斯巴达”了,最后告诉朋友做DBA是需要勇气的。
下面我用自己的测试库演示一下操作系统rm级别的删除数据文件后,数据库仍然处于open状态的时候使用文件句柄来恢复被rm删除的数据文件,并最终顺利打开数据库的实验!我是一个比较严谨的人,所以一上来我先做个了“压缩全库备份”做到有备无患
描述一下场景:
操作系统:Enterprise Linux Enterprise Linux AS release 4 (October Update 8)
数据库版本:Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
一、备份重于一切
因此在演示之前我们先做一下备份
RMAN> show all;
RMAN配置参数区,如下都是默认值
RMAN configuration parameters are:
冗余配置保留政策:冗余数是1
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
开启增量备份:关
CONFIGURE BACKUP OPTIMIZATION OFF; # default
默认备份设备是磁盘
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
控制文件自动备份:关
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
控制文件自动备份目录和格式:%F 【备份设备:Disk】
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
备份的并行度:1,备份类型为备份集
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
数据文件采用复制方式备份
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
归档日志采用复制方式备份
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
最大值:无限制
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
加密数据库:关
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
加密算法采用AES128
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
归档日志删除策略:空
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
快照控制文件名
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_LEO.f'; # default
配置RMAN默认备份介质存放目录到/home/oracle/backup
RMAN> configure channel device type disk format '/home/oracle/backup/DB_LEO_%U';
new RMAN configuration parameters:
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/home/oracle/backup/DB_LEO_%U';
new RMAN configuration parameters are successfully stored
配置控制文件自动备份并保存到/home/oracle/backup/control目录
RMAN> configure controlfile autobackup on; 启动控制文件自动备份
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored
RMAN> configure controlfile autobackup format for device type disk to 设置控制文件自动备份目录和格式
'/home/oracle/backup/control/cf_LEO_%F';
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO
'/home/oracle/backup/control/cf_LEO_%F';
new RMAN configuration parameters are successfully stored
显示配置后RMAN环境变量
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/home/oracle/backup/control/cf_LEO_%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/home/oracle/backup/DB_LEO_%U';
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_LEO.f'; # default
使用rman命令压缩备份数据库
RMAN> backup as compressed backupset full database format
2> '/home/oracle/backup/full_bk1_%u%p%s.rmn' include current controlfile
3> plus
4> archivelog format '/home/oracle/backup/arch_bk1_%u%p%s.rmn' delete all input;(删除备份过的旧归档日志)
Starting backup at 26-JUL-12
current log archived
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=318 devtype=DISK
channel ORA_DISK_1: starting compressed archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=5 recid=1 stamp=784059130
input archive log thread=1 sequence=6 recid=6 stamp=788723549
input archive log thread=1 sequence=7 recid=4 stamp=788723547
input archive log thread=1 sequence=8 recid=5 stamp=788723548
channel ORA_DISK_1: starting piece 1 at 26-JUL-12
channel ORA_DISK_1: finished piece 1 at 26-JUL-12
piece handle=/home/oracle/backup/arch_bk1_01nh3iuq11.rmn tag=TAG20120726T235048 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:16
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/home/oracle/arch/LEO/1_5_784052660.dbf recid=1 stamp=784059130
archive log filename=/home/oracle/arch/LEO/1_6_784052660.dbf recid=6 stamp=788723549
archive log filename=/home/oracle/arch/LEO/1_6_784052660.dbf recid=2 stamp=784059169
archive log filename=/home/oracle/arch/LEO/1_7_784052660.dbf recid=4 stamp=788723547
archive log filename=/home/oracle/arch/LEO/1_7_784052660.dbf recid=3 stamp=784059221
archive log filename=/home/oracle/arch/LEO/1_8_784052660.dbf recid=5 stamp=788723548
channel ORA_DISK_1: starting compressed archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=1 recid=8 stamp=789695445
channel ORA_DISK_1: starting piece 1 at 26-JUL-12
channel ORA_DISK_1: finished piece 1 at 26-JUL-12
piece handle=/home/oracle/backup/arch_bk1_02nh3ivb12.rmn tag=TAG20120726T235048 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:09
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/home/oracle/arch/LEO/1_1_788725257.dbf recid=8 stamp=789695445
channel ORA_DISK_1: starting compressed archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=1 recid=7 stamp=788725258
channel ORA_DISK_1: starting piece 1 at 26-JUL-12
channel ORA_DISK_1: finished piece 1 at 26-JUL-12
piece handle=/home/oracle/backup/arch_bk1_03nh3ivk13.rmn tag=TAG20120726T235048 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/home/oracle/arch/LEO/1_1_788723547.dbf recid=7 stamp=788725258
Finished backup at 26-JUL-12
Starting backup at 26-JUL-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00001 name=/u01/app/oracle/oradata/LEO/file1/system01.dbf
input datafile fno=00003 name=/u01/app/oracle/oradata/LEO/file1/sysaux01.dbf
input datafile fno=00002 name=/u01/app/oracle/oradata/LEO/file1/undotbs01.dbf
input datafile fno=00005 name=/u01/app/oracle/oradata/LEO/file1/example01.dbf
input datafile fno=00004 name=/u01/app/oracle/oradata/LEO/file1/users01.dbf
channel ORA_DISK_1: starting piece 1 at 26-JUL-12
channel ORA_DISK_1: finished piece 1 at 26-JUL-12
piece handle=/home/oracle/backup/full_bk1_04nh3ivo14.rmn tag=TAG20120726T235119 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:04:56
channel ORA_DISK_1: starting compressed full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
including current control file in backupset
channel ORA_DISK_1: starting piece 1 at 26-JUL-12
channel ORA_DISK_1: finished piece 1 at 26-JUL-12
piece handle=/home/oracle/backup/full_bk1_05nh3j9015.rmn tag=TAG20120726T235119 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 26-JUL-12
Starting backup at 26-JUL-12
current log archived
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed archive log backupset
channel ORA_DISK_1: specifying archive log(s) in backup set
input archive log thread=1 sequence=2 recid=9 stamp=789695779
channel ORA_DISK_1: starting piece 1 at 26-JUL-12
channel ORA_DISK_1: finished piece 1 at 26-JUL-12
piece handle=/home/oracle/backup/arch_bk1_06nh3j9316.rmn tag=TAG20120726T235619 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
channel ORA_DISK_1: deleting archive log(s)
archive log filename=/home/oracle/arch/LEO/1_2_788725257.dbf recid=9 stamp=789695779
Finished backup at 26-JUL-12
Starting Control File and SPFILE Autobackup at 26-JUL-12
piece handle=/home/oracle/backup/control/cf_LEO_c-1558319476-20120726-00 comment=NONE
Finished Control File and SPFILE Autobackup at 26-JUL-12
ll 看一下我们的备份文件已经生成,下面可以安心做实验啦
-rw-r----- 1 oracle oinstall 11180032 Jul 26 23:51 arch_bk1_01nh3iuq11.rmn
-rw-r----- 1 oracle oinstall 7031808 Jul 26 23:51 arch_bk1_02nh3ivb12.rmn
-rw-r----- 1 oracle oinstall 1030144 Jul 26 23:51 arch_bk1_03nh3ivk13.rmn
-rw-r----- 1 oracle oinstall 8704 Jul 26 23:56 arch_bk1_06nh3j9316.rmn
drwxr-xr-x 2 oracle oinstall 4096 Jul 26 23:56 control
-rw-r----- 1 oracle oinstall 57221120 Jul 26 23:56 full_bk1_04nh3ivo14.rmn
-rw-r----- 1 oracle oinstall 1097728 Jul 26 23:56 full_bk1_05nh3j9015.rmn
二、模拟数据文件删除
SYS@LEO> select status from v$instance; 查看数据库是否是open状态,是open没有问题
STATUS
------------
OPEN
SYS@LEO> select name from v$datafile; 查看存在的数据文件
NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/LEO/file1/system01.dbf
/u01/app/oracle/oradata/LEO/file1/undotbs01.dbf
/u01/app/oracle/oradata/LEO/file1/sysaux01.dbf
/u01/app/oracle/oradata/LEO/file1/users01.dbf
/u01/app/oracle/oradata/LEO/file1/example01.dbf
SYS@LEO> host rm /u01/app/oracle/oradata/LEO/file1/example01.dbf 删除example01.dbf数据文件
我们检查一下删除example01.dbf文件后数据库是不是还处在open状态
SYS@LEO> select status from v$instance; 呵呵 就像我们在开始说的一下,open木有问题
STATUS
------------
OPEN
我们检查一下删除example01.dbf文件后表空间处于什么状态呢?
SYS@LEO> select TABLESPACE_NAME,status from dba_tablespaces;
TABLESPACE_NAME STATUS
------------------------------ ---------
SYSTEM ONLINE
UNDOTBS ONLINE
SYSAUX ONLINE
TEMPTS1 ONLINE
USERS ONLINE
TEMP1 ONLINE
TEMP2 ONLINE
EXAMPLE ONLINE 我们看到EXAMPLE还是联机状态,为什么是这样呢,我们不是已经把文件删除了嘛!对喽可能有人已经想到了,我们是从操作系统层面上直接删除的(没有通过数据库删除),所以数据库此时还认为这个文件没有变化,数据字典中记录的还是完好的状态
conn ls/ls 切换用户
数据文件被删除,所以创建表失败(当我们往数据文件上写数据时,数据库才会检查文件状态)
LS@LEO> create table t1 tablespace example as select * from liusheng;
create table t1 tablespace example as select * from liusheng
ERROR at line 1:
ORA-01116: error in opening database file 5
ORA-01110: data file 5: '/u01/app/oracle/oradata/LEO/file1/example01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
注:如果我们在其他数据文件上创建表时,是可以正常创建的,因为其他数据文件没有被破坏
例LS@LEO> create table t2 tablespace users as select * from liusheng;
Table created.
三、通过文件句柄恢复数据文件
查询dbwr进程pid
[oracle@secdb1 bdump]$ ps -ef|grep dbw|grep -v grep
oracle 26745 1 0 Jul26 ? 00:00:01 ora_dbw0_LEO
注:1.可能有的朋友对 grep –v grep这句不是很明白,-v选项反向过滤oracle 2405 421 0 01:03 pts/9 00:00:00 grep dbw这行记录的意思,只保留ora_dbw0_LEO这行记录,方便观察
2. dbwr进程会打开所有访问数据文件的句柄。在proc目录中可以查到,目录名是进程PID,fd(file discription)表示文件描述符。
cd /proc/26745/fd
ls -l
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 0 -> /dev/null
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 1 -> /dev/null
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 10 -> /u01/app/oracle/admin/LEO/adump/ora_26737.aud
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 11 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 12 -> /dev/zero
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 13 -> /u01/app/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 14 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_LEO.dat
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 15 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkLEO
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 16 -> /u01/app/oracle/oradata/LEO/file1/control01.ctl
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 17 -> /u01/app/oracle/oradata/LEO/file2/control02.ctl
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 18 -> /u01/app/oracle/oradata/LEO/file3/control03.ctl
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 19 -> /u01/app/oracle/oradata/LEO/file1/system01.dbf
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 20 -> /u01/app/oracle/oradata/LEO/file1/undotbs01.dbf
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 21 -> /u01/app/oracle/oradata/LEO/file1/sysaux01.dbf
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 22 -> /u01/app/oracle/oradata/LEO/file1/users01.dbf
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 23 -> /u01/app/oracle/oradata/LEO/file1/example01.dbf (deleted)
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 24 -> /u01/app/oracle/oradata/LEO/file1/temp01.dbf
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 25 -> /u01/app/oracle/oradata/LEO/file1/temp1_01.dbf
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 26 -> /u01/app/oracle/oradata/LEO/file1/temp1_02.dbf
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Jul 27 01:09 4 -> /dev/null
l-wx------ 1 oracle oinstall 64 Jul 27 01:09 5 -> /u01/app/oracle/admin/LEO/udump/leo_ora_26737.trc
l-wx------ 1 oracle oinstall 64 Jul 27 01:09 6 -> /u01/app/oracle/admin/LEO/bdump/alert_LEO.log
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 7 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkinstLEO (deleted)
l-wx------ 1 oracle oinstall 64 Jul 27 01:09 8 -> /u01/app/oracle/admin/LEO/bdump/alert_LEO.log
lrwx------ 1 oracle oinstall 64 Jul 27 01:09 9 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_LEO.dat
注:“23 -> /u01/app/oracle/oradata/LEO/file1/example01.dbf (deleted)”被删除数据文件会被标示(deleted),我们使用的是redhat linux,如果在solaris系统中使用lsof命令查看句柄和文件对应关系,我们通过copy文件句柄方式恢复数据文件到原位置
cp /proc/26745/fd/23 /u01/app/oracle/oradata/LEO/file1/example01.dbf
确认example01.dbf数据文件已经恢复成功
ll /u01/app/oracle/oradata/LEO/file1/example01.dbf
-rw-r----- 1 oracle oinstall 209723392 Jul 27 01:22 /u01/app/oracle/oradata/LEO/file1/example01.dbf
四、数据文件recover
LS@LEO> alter database datafile 5 offline; 先把example01.dbf文件脱机
Database altered.
LS@LEO> recover datafile 5; 完成介质恢复,实质同步控制文件、redo日志scn号
Media recovery complete.
LS@LEO> alter database datafile 5 online; 再把example01.dbf文件联机
Database altered.
五、测试是否可以正常创建表t1
LS@LEO> create table t1 tablespace example as select * from liusheng;
Table created. 成功创建
LS@LEO> select * from t1;
ORDERID NAME LS_DATE
---------- ---------- -------------------
1 ls1 1981-01-02 00:00:00
1 ls2 1998-01-03 00:00:00
1 ls3 1999-01-04 00:00:00
1 ls4 2000-01-05 00:00:00
1 ls5 2000-01-06 00:00:00
1 ls6 2001-01-07 00:00:00
1 ls7 2001-01-08 00:00:00
1 ls8 2002-01-09 00:00:00
1 ls9 2002-01-10 00:00:00
1 ls10 2011-01-11 00:00:00
10 rows selected.
小结:当我们在Linux系统中不小心rm了数据文件时,一定要冷静,不要做关闭数据库、重启操作系统等危险操作,因为在不了解数据库运行状态的前提下做这些往往是徒劳的,还可能造成无法挽回的后果。此时我们不妨在数据库open状态下使用文件句柄方式来恢复被我们rm掉的数据文件。因为只要数据文件被某个进程使用着,那么这个进程就会一直持有这个数据文件的句柄,那么所指向的数据文件依然可以读写,我们可以从proc->pid->fd目录中找到被删除的数据文件句柄(deleted)copy到原位置来恢复,这也是在恢复窗口时间很短的情况下快速恢复的好方法。
oracle视频教程请关注:http://u.youku.com/user_video/id_UMzAzMjkxMjE2.html
转载于:https://blog.51cto.com/19880614/1281290