环境:Oracle 11.2, CEntOS 6.6, VMware
原因:磁盘I/O性能低效,VMware虚拟机备份占用大量磁盘I/O吞吐能力
现象:最后的alert日志内容
Fri Feb 03 17:38:57 2017
********************* ATTENTION: ********************
The controlfile header block returned by the OS
has a sequence number that is too old.
The controlfile might be corrupted.
PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
without following the steps below.
RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
TO THE DATABASE, if the controlfile is truly corrupted.
In order to re-start the instance safely,
please do the following:
(1) Save all copies of the controlfile for later
analysis and contact your OS vendor and Oracle support.
(2) Mount the instance and issue:
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
(3) Unmount the instance.
(4) Use the script in the trace file to
RE-CREATE THE CONTROLFILE and open the database.
*****************************************************
USER (ospid: 30341): terminating the instance
Fri Feb 03 17:38:58 2017
System state dump requested by (instance=1, osid=30341 (PR00)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/wmsdst/WMSDI/trace/WMSDI_diag_30193_20170203173858.trc
Dumping diagnostic data in directory=[cdmp_20170203173858], requested by (instance=1, osid=30341 (PR00)), summary=[abnormal instance termination].
Instance terminated by USER, pid = 30341
问题处理:定位并消除磁盘性能影响因素,查询并按照MOS上的最新解决方案处理。
简单处理:请先备份每一个在用的控制文件、因为我们并不知道哪一个是最新的。
然后可以startup mount;看看哪一个控制文件的sequence是最新的,复制覆盖其他未同步的控制文件。
至于MOS上的BUG 14281768 - CONTROL FILE GETS CORRUPTEDOLUTION尽可能不要这么处理,除非在限定短时间内找不到磁盘I/O性能为何低效时临时处理。
Error is typically raised when the Controlfile is overwritten by an older copy of the Controlfile. Most likely this happened due to Storage OR I/o error.
All copies of the control file must have the same internal sequence number for Oracle to start up the database or shut it down in normal or immediate mode.
To make a sanity check in the future , please set the following parameter :-
SQL> alter system set "_controlfile_update_check"='HIGH' scope=spfile; -- then bounce the database.
Please check with your OS System/Storage admin regarding the issue.
The precautions is to relocate the control file on a fast and direct I/O enabled disk , the main target is not letting the OS to write an old copy (cached copy of the controlfile to it).
To reverse the parameter setting :-
SQL> alter system set "_controlfile_update_check"='OFF' scope=spfile; -- then bounce the database.
更多问题触发场景:
VMware上的Oracle数据库最佳实践指南 54页
12. Backup and Recovery
12.1 Oracle Backup and Recovery Overview
The main purpose of a database backup and recovery strategy is to protect the database against data loss and reconstruct the database after data loss. Typical backup tasks performed by an Oracle DBA would include setting up the database environment for backup and recovery, setting up a backup schedule, monitoring the backup and recovery environment, and troubleshooting backup problems
A backup can be either a physical or a logical backup. Physical backups are physical copies of the database files which include data files, control files, and archive log files. Logical backups contain a logical copy of the data, such as tables, indexes, procedures, functions, and so on. You can use Oracle Data Pump to export logical data to binary files, which you can later import into the database.
There are levels of triggering Oracle database backups within the VMware environment:
* In guest Oracle backup using Oracle Recovery Manager (RMAN)
* VMware level backup using VMware vSphere Data Protection™ / VMware vSphere Data Protection Advanced
* Storage based backup tools
* vSphere Virtual Volumes using vSphere 6.0
vSphere recommends either using Oracle Recovery Manager (RMAN), storage-based backup tools, or vSphere Virtual Volumes using vSphere 6.0.
12.2 Oracle Recovery Manager (RMAN)
For implementing an effective Oracle database backup and recovery strategy, Oracle Recovery Manager (RMAN) is typically the preferred Oracle solution.
RMAN provides a common interface for backup tasks across different host operating systems, and offers several backup techniques not available through user-managed methods.
The method of deploying and using RMAN to backup an Oracle database does not change when virtualizing an Oracle database. It is same across both physical and virtualized environments.
For more information on Oracle Recovery Manager, see the Oracle documentation at https://docs.oracle.com/database/121/BRADV/toc.htm.
12.3 vSphere Data Protection
Any virtual machine VMDK can be backed up with VMware snapshot technology as long as it is not set to Independent-Persistent mode.
A virtual machine housing an Oracle database has two types of VMDKs—guest OS VMDK and the VMDKs housing the Oracle data files.
VMware does not recommend that you back up a high transactional, heavy I/O-centric Oracle database using VMware snapshot technology because, during the snapshot removal (consolidation), there is a brief stun moment. No activity is permitted against the virtual machine, which might result in performance issue and service disruptions.
For more information, see A snapshot removal can stop a virtual machine for long time (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002836).
You can, however, back up Oracle non-production databases (development, test, QA, pre-production, and so on) using VMware snapshot technology.
此文地址:http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/vmware-oracle-databases-on-vmware-best-practices-guide.pdf