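Eygle's recent book DBA手记4 describes using a file descriptor to recover an accidentally deleted data file. I don't have the book yet, but after reading through some documentation I ran a test and the data was fully recovered. My notes follow: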
[oracle@server119 admin]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Tue Aug 21 15:11:31 2012
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> create tablespace rm datafile '/db2/oracle10g/oradata/rm01.dbf' size 50m;
Tablespace created.
SQL> conn xiaoyu/xiaoyu
Connected.
SQL> create table rm01 tablespace rm as select * from dba_Objects;
Table created.
SQL> select count(*) from rm01;
COUNT(*)
----------
52653
The steps above create the test data file /db2/oracle10g/oradata/rm01.dbf and the table rm01.
[oracle@server119 ~]$ rm -rf /db2/oracle10g/oradata/rm01.dbf
The data file has now been deleted at the OS level.
[oracle@server119 ~]$ sqlplus xiaoyu/xiaoyu
SQL*Plus: Release 10.2.0.5.0 - Production on Tue Aug 21 15:29:04 2012
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select count(*) from rm01;
COUNT(*)
----------
52653
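The query still succeeds because the blocks are read from the buffer cache in memory. Flushing the buffer cache, which marks the buffers in memory as stale, forces the next query to go to disk: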
SQL> alter system flush buffer_cache;
System altered.
SQL> select count(*) from rm01;
select count(*) from rm01
*
ERROR at line 1:
ORA-01116: error in opening database file 574
ORA-01110: data file 574: '/db2/oracle10g/oradata/rm01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Two DBWn processes still hold the data file open:
[root@server119 fd]# lsof |grep /db2/oracle10g/oradata/rm01.dbf
oracle 2199 oracle 594uW REG 8,17 52436992 12648449 /db2/oracle10g/oradata/rm01.dbf (deleted)
oracle 2201 oracle 19u REG 8,17 52436992 12648449 /db2/oracle10g/oradata/rm01.dbf (deleted)
[root@server119 fd]# ps -ef|grep 2199
oracle 2199 1 0 15:24 ? 00:00:00 ora_dbw0_benguo
root 12226 12798 0 15:47 pts/11 00:00:00 grep 2199
[root@server119 fd]# ps -ef|grep 2201
oracle 2201 1 0 15:24 ? 00:00:00 ora_dbw1_benguo
root 12267 12798 0 15:47 pts/11 00:00:00 grep 2201
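When lsof is not installed, the same information can be read straight from /proc. The following is a minimal sketch, not part of the original test: the function name list_deleted_fds is hypothetical, and it assumes a Linux /proc filesystem and enough privilege (root, or the owner of the process) to read the target's fd directory.

```shell
#!/bin/sh
# Sketch: list descriptors of a process that point at deleted files.
# list_deleted_fds is a hypothetical helper name, not an Oracle or OS tool.
list_deleted_fds() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink; a removed file shows " (deleted)" at the end.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                # Print the descriptor number followed by the original path.
                printf '%s %s\n' "${fd##*/}" "$target"
                ;;
        esac
    done
}
```

With the PIDs from the lsof output above, it would be called as list_deleted_fds 2199 or list_deleted_fds 2201; the descriptor number it prints is the one to use under /proc/<pid>/fd/.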
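rm only removes the directory entry that links the file name to its inode; the blocks occupied by the file are not freed while a process still has the file open. In that respect it resembles the way truncate and drop work in Oracle. The output above shows that the database processes have not released the file, so as long as we do not shut down the database, the data file can be recovered completely. If the database is shut down, the processes holding the file release their descriptors, and data may be lost.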
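The behaviour described above can be reproduced without Oracle. This is a small sketch on throwaway temporary files, assuming a Linux /proc filesystem; the variable names are made up for the demo, and the shell's own descriptor 3 stands in for the descriptor that DBWn holds on the data file.

```shell
#!/bin/sh
# Demo: a file removed with rm stays readable through an open descriptor.
# All paths are temporary files created here, not Oracle data files.
demo_dir=$(mktemp -d)
victim="$demo_dir/victim.dat"
restored="$demo_dir/restored.dat"

printf 'important data\n' > "$victim"

# Hold the file open on descriptor 3, as DBWn holds a data file open.
exec 3< "$victim"

# Remove the directory entry; the inode survives while descriptor 3 is open.
rm -f "$victim"

# The child cp inherits descriptor 3, so /proc/self/fd/3 reaches the inode.
# For another process, the path would be /proc/<pid>/fd/<n> instead.
cp /proc/self/fd/3 "$restored"
recovered=$(cat "$restored")
echo "recovered: $recovered"

# Closing the last descriptor lets the kernel free the blocks for good.
exec 3<&-
rm -rf "$demo_dir"
```

Plain cp is used rather than cp -a because the entries under /proc/<pid>/fd are symbolic links, and -a tries to recreate the link instead of copying the file contents.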
[root@server119 fd]# cp -a /proc/2199/fd/594 /db2/oracle10g/oradata/rm01.dbf
cp: cannot create symbolic link "/db2/oracle10g/oradata/rm01.dbf": Permission denied
[root@server119 fd]# cp /proc/2199/fd/594 /db2/oracle10g/oradata/rm01.dbf
[root@server119 fd]# ls -l /db2/oracle10g/oradata/rm01.dbf
-rw-r----- 1 root root 52436992 08-21 15:54 /db2/oracle10g/oradata/rm01.dbf
[root@server119 fd]# chown oracle:dba /db2/oracle10g/oradata/rm01.dbf
[root@server119 fd]# stat /db2/oracle10g/oradata/rm01.dbf
File: "/db2/oracle10g/oradata/rm01.dbf"
Size: 52436992 Blocks: 102536 IO Block: 4096 regular file
Device: 811h/2065d Inode: 12648450 Links: 1
Access: (0640/-rw-r-----) Uid: ( 500/ oracle) Gid: ( 501/ dba)
Access: 2012-08-21 15:54:18.000000000 +0800
Modify: 2012-08-21 15:54:18.000000000 +0800
Change: 2012-08-21 15:54:35.000000000 +0800
[root@server119 fd]# su - oracle
[oracle@server119 ~]$ sqlplus xiaoyu/xiaoyu
SQL*Plus: Release 10.2.0.5.0 - Production on Tue Aug 21 15:55:34 2012
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select count(*) from rm01;
COUNT(*)
----------
52653
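Once the database has been shut down, however, the processes have released their descriptors, and the file can no longer be recovered through a file descriptor.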
SQL> startup force;
ORACLE instance started.
Total System Global Area 2.6844E+10 bytes
Fixed Size 2144984 bytes
Variable Size 855639336 bytes
Database Buffers 2.5971E+10 bytes
Redo Buffers 14630912 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 574 - see DBWR trace file
ORA-01110: data file 574: '/db2/oracle10g/oradata/rm01.dbf'
At this point lsof no longer lists the accidentally removed file among the files opened by any process.
[root@server119 ~]# lsof |grep /db2/oracle10g/oradata/rm01.dbf
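The data was fully recovered with nothing lost; combining OS-level tools with Oracle can make a problem like this much easier to solve. I have not yet tested how the lsof approach behaves with ASM. On ASM the only difference should be that an ordinary cp cannot be used, and it should be possible to upload the data file to the ASM disks through Oracle's FTP and HTTP features.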
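So when a database runs into trouble, you are often lucky if the DB is still open, and shutting it down immediately is likely to cause a chain of further problems. With a backup, of course, there is much less to worry about. When diagnosing a problem, stay calm first and only then act; the live state of the system is often valuable.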
From "ITPUB博客", link: http://blog.itpub.net/25362835/viewspace-1059232/. If you repost this article, please credit the source; otherwise legal liability will be pursued.