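Today I ran into a rather odd problem. A customer's test database lost power and was restarted, and on startup the database reported ORA-01157: cannot identify/lock data file together with ORA-01110. It turned out that the storage volume had not been mounted after the system came back up, and all of the tablespaces lived on that volume; once the storage was mounted by hand, everything worked again. I did not record the details at the time, so here I reproduce the problem in a test environment.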
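Because the root cause was simply that the datafiles were not visible to the operating system, a small pre-start check could have caught it before Oracle did. The following is a minimal sketch, not the procedure used on the customer system; `check_datafiles` is a hypothetical helper name, and the paths passed to it are whatever datafiles you expect to exist.

```shell
# Hypothetical helper: succeed only if every given datafile path exists
# and is readable by the current OS user (normally the oracle user).
check_datafiles() {
    missing=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            # Report each file that cannot be read, then keep checking the rest.
            echo "MISSING: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example call before running "startup" (path taken from the test below):
#   check_datafiles /home/oracle/test.dbf || echo "fix storage before starting Oracle"
```

If the check fails for every file on one volume, an unmounted filesystem is a more likely cause than a deleted file.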
Creating the test data
[oracle@XLJ181 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Mon Dec 10 19:27:14 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SYS@cams> create tablespace test datafile '/home/oracle/test.dbf' size 100M;
Tablespace created.
SYS@cams> create user test identified by 123456 default tablespace test;
User created.
SYS@cams> grant connect,resource to test;
Grant succeeded.
SYS@cams> conn test/123456
Connected.
TEST@cams> create table test(id number primary key,name varchar2(20));
Table created.
TEST@cams> insert into test values(1,'bob');
1 row created.
TEST@cams> insert into test values(2,'joe');
1 row created.
TEST@cams> select count(*) from test;
COUNT(*)
----------
2
TEST@cams> conn / as sysdba
Connected.
SYS@cams> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SYS@cams> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Simulating an accidental file deletion
[oracle@XLJ181 ~]$ mv /home/oracle/test.dbf /home/oracle/test.dbf.bak
The failure appears
Start the database; it reports that the datafile cannot be found:
[oracle@XLJ181 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Mon Dec 10 19:38:26 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SYS@cams> startup;
ORACLE instance started.
Total System Global Area 5344731136 bytes
Fixed Size 2262656 bytes
Variable Size 1040189824 bytes
Database Buffers 4294967296 bytes
Redo Buffers 7311360 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
Check the alert log, which points to two trace files:
Mon Dec 10 19:38:35 2018
ALTER DATABASE OPEN
Errors in file /u01/app/oracle/diag/rdbms/cams/cams/trace/cams_dbw0_21153.trc:
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /u01/app/oracle/diag/rdbms/cams/cams/trace/cams_ora_21175.trc:
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-1157 signalled during: ALTER DATABASE OPEN...
The cams_ora_21175.trc file contains the following errors:
DDE: Problem Key 'ORA 1110' was flood controlled (0x1) (no incident)
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
The cams_dbw0_21153.trc file contains the following errors:
ORA-01157: cannot identify/lock data file 63 - see DBWR trace file
ORA-01110: data file 63: '/home/oracle/test.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
The problem is now obvious: Oracle cannot find data file 63: '/home/oracle/test.dbf'.
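In the original incident every tablespace sat on the unmounted volume, so more than one file would be reported. A minimal sketch for pulling the distinct file paths out of the ORA-01110 lines follows; `list_ora01110_files` is a hypothetical helper name, and it assumes the messages start at the beginning of the line, as they do in the log excerpts above.

```shell
# Hypothetical helper: read log text on standard input and print each
# distinct datafile path named in an ORA-01110 line, sorted.
list_ora01110_files() {
    # Keep only ORA-01110 lines and reduce them to the quoted path.
    sed -n "s/^ORA-01110: data file [0-9][0-9]*: '\(.*\)'.*/\1/p" | sort -u
}

# Example call (pass the path of your own alert log):
#   list_ora01110_files < /path/to/alert.log
```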
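So how should we deal with it? Test environments in particular usually run without archiving to save resources, and they certainly have no RMAN backups. How do we get the database running again while keeping data loss to a minimum?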
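Before picking a fix it helps to confirm both assumptions from the mounted instance: that the database really is in NOARCHIVELOG mode, and which files Oracle considers unavailable. The sketch below only prints the two queries; `precheck_sql` is a hypothetical helper name, and piping its output to SQL*Plus assumes the same OS authentication used in the sessions above.

```shell
# Hypothetical helper: print (do not run) two checks that work in MOUNT state.
precheck_sql() {
    cat <<'EOF'
select log_mode from v$database;
select file#, online_status, error from v$recover_file;
EOF
}

# Example call:
#   precheck_sql | sqlplus -s / as sysdba
```

The first query shows the archiving mode, and the second lists the files that need attention together with the reason.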
A common solution:
offline drop + recreate
SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database datafile '/home/oracle/test.dbf' offline drop;
SQL> alter database open;
SQL> -- Note: before running the next statement, check whether any other files belong to this tablespace
SQL> drop tablespace test including contents;
SQL> create tablespace test datafile '/home/oracle/test.dbf' size 100M;
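The check mentioned before the drop tablespace step can be done with a query against dba_data_files once the database is open, that is, after alter database open and before the drop. A minimal sketch that builds the query for a given tablespace follows; `tablespace_files_sql` is a hypothetical helper name, and it upper-cases its argument on the assumption that the tablespace was created with an unquoted name.

```shell
# Hypothetical helper: print the query that lists every datafile
# belonging to the tablespace named in $1.
tablespace_files_sql() {
    # Unquoted identifiers are stored in upper case in the data dictionary.
    ts=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf "select file_id, file_name from dba_data_files where tablespace_name = '%s';\n" "$ts"
}

# Example call:
#   tablespace_files_sql test | sqlplus -s / as sysdba
```

If the query returns more than the one missing file, dropping the tablespace also discards the data held in the files that are still intact.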
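Because this is a test environment, find a way to rebuild the data, or import it from the most recent logical backup or from another test system; that keeps data loss to a minimum.
If the lost tablespace belongs to a core application, it is simpler to recreate the tablespace, clean out the related data, and import a fresh copy.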