[20160718]rman恢复与filesystemio_options

最新推荐文章于 2023-09-25 16:47:04 发布

cuixun7780

最新推荐文章于 2023-09-25 16:47:04 发布

阅读量147

点赞数

[20160718]rman恢复与参数filesystemio_options=setall.txt

--今天在测试环境恢复数据库，遇到许多诡异的问题，做一个记录。

1.恢复环境。

SYS@dbcn1> @ &r/ver
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

2.由于以前这台机器作为测试环境，曾经修改参数filesystemio_options=setall：

--简单对这个参数filesystemio_options做一个说明：

FILESYTEMIO_OPTIONS can be set to one of the following values:

. ASYNCH: enable asynchronous I/O on file system files, which has no timing requirement for transmission.
. DIRECTIO: enable direct I/O on file system files, which bypasses the buffer cache.
. SETALL: enable both asynchronous and direct I/O on file system files.
. NONE: disable both asynchronous and direct I/O on file system files.

这里是否开启文件系统的异步IO主要关乎filesystemio_options参数，该参数根据平台会有一个合适的默认值。

SYS@dbcn1> show parameter filesystem
NAME TYPE VALUE
-------------------- ------ -------
filesystemio_options string none

--可以我以前看一些blog做的测试，设置为filesystemio_options=SETALl.

3.测试中遇到的问题：
--由于备份很大，我插入移动硬盘到我的机器，建立共享文件夹，通过cifs协议mount上本地磁盘。执行如下：

# mount.cifs //192.168.XXX.XXX/w$ /mnt/iso -o username=administrator,ip=192.168.XXX.XXX,uid=oracle,gid=oinstall,file_mode=0644,rsize=32768,wsize=32768

run {
allocate channel c1 type disk;
allocate channel c2 type disk;
allocate channel c3 type disk;
set until time '2015-03-01 18:00:00';
set newname for datafile 1 to '/u01/app/oracle/oradata/dbcn1/datafile/system.308.862160493';
.....
set newname for datafile 43 to '/u01/app/oracle/oradata/dbcn1/datafile/portal_emrca01.dbf';
restore database;
switch datafile all;
recover database;
release channel c1;
release channel c2;
release channel c3;
}

-- 可以报错：
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 07/18/2016 08:34:31
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 4 found to restore
RMAN-06023: no backup or copy of datafile 3 found to restore
RMAN-06023: no backup or copy of datafile 2 found to restore
RMAN-06023: no backup or copy of datafile 1 found to restore

--很奇怪我list backupset显示文件正常，而且一旦报错，备份集变成了expired。

--我采用别的方式取出这几个文件，脚本如下：
$ cat bb.rman
set verify off;
set echo off;
set serveroutput on;
select TO_CHAR(systimestamp,'YYYYMMDD HH:MI:SS') from dual;
variable devicename varchar2(255);
declare
omfname varchar2(512) := NULL;
done boolean;
begin
    dbms_output.put_line(' ');
    dbms_output.put_line(' Allocating device.... ');
    dbms_output.put_line(' Specifying datafiles... ');
       :devicename := dbms_backup_restore.deviceAllocate;
    dbms_output.put_line(' Specifing datafiles... ');
    dbms_backup_restore.restoreSetDataFile;
        dbms_backup_restore.restoreDataFileTo(1, '/u01/app/oracle/oradata/dbcn1/datafile/system.308.862160493', 0, 'SYSTEM');
    dbms_output.put_line(' Restoring ... ');
    dbms_backup_restore.restoreBackupPiece('/mnt/dbcn.backup/fulls/dbcn_DBCN_20150226_784_full', done);
    if done then
        dbms_output.put_line(' Restore done.');
    else
        dbms_output.put_line(' ORA-XXXX: Restore failed ');
    end if;
    dbms_backup_restore.deviceDeallocate;
end;
/

-- 其他数据文件依旧选择上面的方法恢复。当时的感觉恢复非常快，但是我通过dstat观察发现：
# dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in   out | int   csw
0   3 97   0   0   0| 60k 274k|   0     0 | 8.2B   39B|5266    75k
0   7 91   1   0   0|   0   612M|1347B 1088B|   0     0 |2768 3375
0   6 93   1   0   0|   0   547M|1231B 412B|   0     0 |2681 3226
0   2 98   0   0   0|   0   542M| 999B 412B|   0     0 |2772 3288
1   5 92   1   0   0| 16M 394M| 779B 650B|   0     0 |3905 4055
3   8 85   3   0   1| 15M 593M|1134B 412B|   0     0 |7277 6833
0   4 94   0   0   0|   0   606M| 714B 318B|   0     0 |2824 3373
0   2 98   0   0   0|   0   455M| 920B 412B|   0     0 |2633 3067
0   7 91   1   0   1|   0   451M|1181B 412B|   0     0 |2577 3010
0   5 93   1   0   0|   0   438M| 618B 412B|   0     0 |3861 3055
0   4 95   1   0   0|   0   654M|1234B 412B|   0     0 |3564 3509
0   2 98   0   0   0|   0   578M| 970B 412B|   0     0 |3422 3317
1   7 91   1   0   1|   0   584M| 938B 412B|   0     0 |3522 3351

--网络流量很少，写入文件很快，当时并没有在意。可以在restore database阶段时。报错如下：

RMAN-06025: no backup of archived log for thread 2 with sequence 783 and starting SCN of 12305100042 found to restore
RMAN-06025: no backup of archived log for thread 2 with sequence 782 and starting SCN of 12299037705 found to restore
RMAN-06025: no backup of archived log for thread 2 with sequence 781 and starting SCN of 12299037630 found to restore
RMAN-06025: no backup of archived log for thread 2 with sequence 780 and starting SCN of 12296441471 found to restore
...

--我只好手工取出archive log文件，执行如下：
RMAN>restore archivelog low sequence 727 high sequence 727 thread 2 ;
RMAN-06025: no backup of archived log for thread 2 with sequence 769 and starting SCN of 12267340138 found to restore

--依旧不行。仔细检查spfile文件，发现：
SYS@dbcn1> create pfile='/tmp/aaa.ora' from spfile ;

--仔细检查参数文件，发现最大的嫌疑就是设置filesystemio_options='SETALL'。
*.filesystemio_options='SETALL'

--更改为filesystemio_options='ASYNC'后，再次重复上面的命令。一切正常，看来在使用rman+网络方式恢复数据库时要特别注意。
--这个时候再使用dstat观察，显示如下：

# dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in   out | int   csw
0   3 97   0   0   0| 60k 316k|   0     0 | 8.2B   39B|5264    75k
16   1 83   0   0   0|   0   526M| 13M 254k|   0     0 |9397 9352
13   1 86   0   0   0|   0 8200k| 12M 231k|   0     0 |8649 9720
5   0 94   0   0   0|   0    66M| 14M 260k|   0     0 |9286 9626
16   1 84   0   0   0|   0    88M| 12M 225k|   0     0 |9143 9615
19   1 80   0   0   0|   0    62M| 13M 228k|   0     0 |9200    10k
10   1 89   0   0   0|   0   316M| 14M 274k|   0     0 |9552 9706
5   0 94   0   0   0|   0 8192k| 14M 258k|   0     0 |9739    10k
16   1 83   0   0   0|   0    46M| 13M 238k|   0     0 |8111 8447
12   1 87   0   0   0|   0    28M| 12M 225k|   0     0 |9050 9104
15   1 84   0   0   0|   0   130M| 13M 235k|   0     0 |9310 9797
5   0 94   0   0   0|   0   396M| 14M 271k|   0     0 |9483    10k
16   1 83   0   0   0|   0 8232k| 14M 257k|   0     0 |8988 9859
14   1 85   0   0   0|   0    56M| 11M 202k|   0     0 |7686 8201
12   1 87   0   0   0|   0    68M| 12M 220k|   0     0 |9249    10k

--这样就比较正常。并且后面根本不会报错。以下错误不再出现：

RMAN-06023: no backup or copy of datafile 1 found to restore
RMAN-06025: no backup of archived log for thread 2 with sequence 783 and starting SCN of 12305100042 found to restore

--我奇怪的是为什么我手工取数据文件我用dbv检查是正常的。总之以后工作注意。估计设置filesystemio_options='NONE'也应该可以，这个是缺省设置。

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/267265/viewspace-2122164/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/267265/viewspace-2122164/