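AIX 7.1: database instance creation fails with PRCR-1079, CRS-2674, CRS-2632
Error:
A production system running AIX 7.1 has Oracle 11g RAC installed, with the database home patched to the July 2021 patch level.
When creating the database with dbca, the run failed at 96% with: there are no more servers to try to place resource ora.xxxx.db on that would satisfy its placement policy.
In other words, the database service failed to start.
I. Permission problem:
1. Try starting the database service manually: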
$ srvctl start db -d <RACDB>
PRCR-1079 : Failed to start resource ora.<RACDB>.db
ORA-01031: insufficient privileges
ORA-01031: insufficient privileges
CRS-2674: Start of 'ora.<RACDB>.db' on '<NODE1>' failed
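Possible causes:
1. The Grid Infrastructure owner is not a member of the OSDBA group of the database being started.
2. The Grid Infrastructure owner does not have write permission on the database dbs directory ($ORACLE_HOME/dbs).
Check the grid user's id information: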
$ id grid
uid=500(grid) gid=503(oinstall) groups=506(asmadmin),508(asmdba),509(asmoper)
The check confirms that the grid user is not in the dba group.
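Both possible causes can be checked from a script. The following is a minimal sketch, not part of the original procedure: `has_group` and `can_write_dir` are hypothetical helper names, and it assumes the OSDBA group is called dba, as it is in this post.

```shell
# Hypothetical helper: succeed if `id` output (passed as $1) lists group $2.
has_group() {
    printf '%s\n' "$1" | grep -qF "($2)"
}

# Hypothetical helper: succeed if the current user can write to directory $1.
can_write_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# The id output shown above; on a live system use "$(id grid)" instead.
sample='uid=500(grid) gid=503(oinstall) groups=506(asmadmin),508(asmdba),509(asmoper)'
if has_group "$sample" dba; then
    echo "cause 1 ruled out: grid is in dba"
else
    echo "cause 1 likely: grid is not in dba"
fi

# Run this part as the grid user, with ORACLE_HOME set to the database home.
if [ -n "${ORACLE_HOME:-}" ]; then
    if can_write_dir "$ORACLE_HOME/dbs"; then
        echo "cause 2 ruled out: $ORACLE_HOME/dbs is writable"
    else
        echo "cause 2 likely: $ORACLE_HOME/dbs is not writable"
    fi
fi
```

The fixed-string match on `(dba)` avoids a false hit on `(asmdba)`, because the opening parenthesis must come directly before the group name.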
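2. Add the grid user to the dba group
1. Set dba as the primary group of grid (as the final id output shows, this also adds dba to the user's group set):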
# chuser pgrp=dba grid
2. id grid now shows that the gid has changed to dba; change it back to gid=503(oinstall):
# chuser pgrp=oinstall grid
3. Check the grid user's id again:
$ id grid
uid=500(grid) gid=503(oinstall) groups=504(dba),506(asmadmin),508(asmdba),509(asmoper)
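Switching the primary group back and forth works, as the output above shows, but it relies on a side effect. A more direct alternative is to set the supplementary group list explicitly. This is a sketch under the assumption that AIX `chuser` accepts a `groups=` attribute holding the complete comma-separated group set, so the existing groups must be listed too. `add_group` is a hypothetical helper, and the script only prints the command rather than running it.

```shell
# Hypothetical helper: append group $2 to comma-separated list $1,
# leaving the list unchanged if the group is already present.
add_group() {
    case ",$1," in
        *",$2,"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "$1,$2" ;;
    esac
}

# Group set taken from the first `id grid` output in this post.
current='oinstall,asmadmin,asmdba,asmoper'
new_groups=$(add_group "$current" dba)

# Print the command for review; run it as root only after checking it.
echo "chuser groups=$new_groups grid"
```

Printing first makes it easy to confirm that no existing group would be dropped before anything changes on the system.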
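If the failure was only a permission problem, it is essentially resolved at this point.
II. PSU bug prevents the database instance from starting
1. Start the database: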
$ srvctl start db -d <RACDB>
PRCR-1079 : Failed to start resource ora.<RACDB>.db
CRS-2674: Start of 'ora.<RACDB>.db' on '<NODE1>' failed
CRS-2674: Start of 'ora.<RACDB>.db' on '<NODE2>' failed
CRS-2632: There are no more servers to try to place resource 'ora.<RACDB>.db' on that would satisfy its placement policy
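Each CRS-2674 line names a node on which the start failed, and those are the nodes whose alert logs are worth reading next. A small sketch for pulling the node names out of such output follows; `failed_nodes` is a hypothetical helper, and the pattern assumes the message layout shown above.

```shell
# Hypothetical helper: read srvctl/CRS output on stdin and print each node
# named in a CRS-2674 "Start of ... failed" line.
failed_nodes() {
    sed -n "s/^CRS-2674: Start of '.*' on '\(.*\)' failed.*/\1/p"
}

# Input copied from the output above (placeholders left as they are).
failed_nodes <<'EOF'
PRCR-1079 : Failed to start resource ora.<RACDB>.db
CRS-2674: Start of 'ora.<RACDB>.db' on '<NODE1>' failed
CRS-2674: Start of 'ora.<RACDB>.db' on '<NODE2>' failed
EOF
```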
2. Check the database alert log:
Alert log path:
$ORACLE_BASE/diag/rdbms/<database_name>/<instance_name>/trace/alert_<instance_name>.log
Tue Sep 28 11:19:15 2021
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Initial number of CPU is 16
Number of processor cores in the system is 4
2021-09-28 11:19:15.505:
[USER(9306190)]CRS-2316:Fatal error: cannot initialize GPnP, CLSGPNP_ERR (Generic GPnP error).
kggpnpInit: failed to init gpnp
WARNING: No cluster interconnect has been specified. Depending on
the communication driver configured Oracle cluster traffic
may be directed to the public interface of this machine.
Oracle recommends that RAC clustered databases be configured
with a private interconnect for enhanced security and
performance.
SMR is corrupted and will be recreated. All Health Check clients should disconnect and reconnect to the instance.
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/oracle/product/11.2.0/db_1/dbs/arch
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options.
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name: AIX
Node name: gt3-sky-db2
Release: 1
Version: 7
Machine: 00F882F14C00
Using parameter settings in server-side pfile /u01/app/oracle/product/11.2.0/db_1/dbs/initsdltsky2.ora
System parameters with non-default values:
processes = 1500
sessions = 2272
spfile = "+DG_DATA/sdltsky/spfilesdltsky.ora"
sga_target = 25G
control_files = "+DG_DATA/sdltsky/controlfile/current.256.1084443507"
db_block_size = 8192
compatible = "11.2.0.4.0"
cluster_database = TRUE
db_create_file_dest = "+DG_DATA"
thread = 2
undo_tablespace = "UNDOTBS2"
instance_number = 2
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
remote_listener = "gt3-sky-db-scanip:1521"
audit_file_dest = "/u01/app/oracle/admin/sdltsky/adump"
audit_trail = "DB"
db_name = "sdltsky"
open_cursors = 300
pga_aggregate_target = 10G
diagnostic_dest = "/u01/app/oracle"
Cluster communication is configured to use the following interface(s) for this instance
140.12.52.82
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Tue Sep 28 11:19:16 2021
WARNING: process PMON (ospid: 8782080) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
PMON started with pid=2, OS id=8782080
Tue Sep 28 11:19:16 2021
WARNING: process PSP0 (ospid: 9109698) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
PSP0 started with pid=3, OS id=9109698
Tue Sep 28 11:19:17 2021
WARNING: process VKTM (ospid: 9044014) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
VKTM started with pid=4, OS id=9044014 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Tue Sep 28 11:19:17 2021
WARNING: process GEN0 (ospid: 8651102) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
GEN0 started with pid=5, OS id=8651102
Tue Sep 28 11:19:17 2021
WARNING: process DIAG (ospid: 4587720) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
DIAG started with pid=6, OS id=4587720
Tue Sep 28 11:19:17 2021
WARNING: process DBRM (ospid: 8454202) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
DBRM started with pid=7, OS id=8454202
Tue Sep 28 11:19:17 2021
WARNING: process PING (ospid: 6619406) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
PING started with pid=8, OS id=6619406
Tue Sep 28 11:19:17 2021
WARNING: process ACMS (ospid: 8716414) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
ACMS started with pid=9, OS id=8716414
Tue Sep 28 11:19:17 2021
WARNING: process DIA0 (ospid: 8585570) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
DIA0 started with pid=10, OS id=8585570
Tue Sep 28 11:19:17 2021
WARNING: process LMON (ospid: 9240624) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LMON started with pid=11, OS id=9240624
Tue Sep 28 11:19:17 2021
WARNING: process LMD0 (ospid: 8323496) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LMD0 started with pid=12, OS id=8323496
* System load used for high load check
* New Low - High Load Threshold Range = [55296 - 73728]
Tue Sep 28 11:19:17 2021
WARNING: process LMS0 (ospid: 8257856) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LMS0 started with pid=13, OS id=8257856 at elevated priority
Tue Sep 28 11:19:17 2021
WARNING: process LMS1 (ospid: 8192280) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LMS1 started with pid=14, OS id=8192280 at elevated priority
Tue Sep 28 11:19:17 2021
WARNING: process RMS0 (ospid: 7930342) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
RMS0 started with pid=15, OS id=7930342
Tue Sep 28 11:19:17 2021
WARNING: process LMHB (ospid: 8650764) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LMHB started with pid=16, OS id=8650764
Tue Sep 28 11:19:17 2021
WARNING: process MMAN (ospid: 8257610) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
MMAN started with pid=17, OS id=8257610
Tue Sep 28 11:19:17 2021
WARNING: process DBW0 (ospid: 7864620) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
DBW0 started with pid=18, OS id=7864620
Tue Sep 28 11:19:17 2021
WARNING: process DBW1 (ospid: 8520008) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
DBW1 started with pid=19, OS id=8520008
Tue Sep 28 11:19:18 2021
WARNING: process LGWR (ospid: 8126690) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LGWR started with pid=20, OS id=8126690
Tue Sep 28 11:19:18 2021
WARNING: process CKPT (ospid: 7864414) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
CKPT started with pid=21, OS id=7864414
Tue Sep 28 11:19:18 2021
WARNING: process SMON (ospid: 8323192) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
SMON started with pid=22, OS id=8323192
Tue Sep 28 11:19:18 2021
WARNING: process RECO (ospid: 7995614) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
RECO started with pid=23, OS id=7995614
Tue Sep 28 11:19:18 2021
WARNING: process RBAL (ospid: 7537034) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
RBAL started with pid=24, OS id=7537034
Tue Sep 28 11:19:18 2021
WARNING: process ASMB (ospid: 7799038) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
ASMB started with pid=25, OS id=7799038
Tue Sep 28 11:19:18 2021
WARNING: process MMON (ospid: 8519750) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
MMON started with pid=26, OS id=8519750
Tue Sep 28 11:19:18 2021
WARNING: process MMNL (ospid: 7471376) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
MMNL started with pid=27, OS id=7471376
NOTE: initiating MARK startup
Starting background process MARK
Tue Sep 28 11:19:18 2021
WARNING: process MARK (ospid: 8061148) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
MARK started with pid=28, OS id=8061148
NOTE: MARK has subscribed
lmon registered with NM - instance number 2 (internal mem no 1)
Reconfiguration started (old inc 0, new inc 4)
List of instances:
1 2 (myinst: 2)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* domain 0 valid = 0 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Tue Sep 28 11:19:22 2021
WARNING: process LCK0 (ospid: 7012608) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
LCK0 started with pid=30, OS id=7012608
Starting background process RSMN
Tue Sep 28 11:19:23 2021
WARNING: process RSMN (ospid: 8912918) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
RSMN started with pid=31, OS id=8912918
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /u01/app/oracle
Tue Sep 28 11:19:24 2021
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=140.12.52.86)(PORT=1521))' SCOPE=MEMORY SID='sdltsky2';
ALTER DATABASE MOUNT /* db agent *//* {2:37398:1620} */
WARNING: process USER (ospid: 6881580) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
NOTE: Loaded library: System
SUCCESS: diskgroup DG_DATA was mounted
NOTE: dependency between database sdltsky and diskgroup resource ora.DG_DATA.dg is established
Tue Sep 28 11:19:31 2021
Successful mount of redo thread 2, with mount id 3414003449
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT /* db agent *//* {2:37398:1620} */
ALTER DATABASE OPEN /* db agent *//* {2:37398:1620} */
Picked broadcast on commit scheme to generate SCNs
Thread 2 opened at log sequence 4
Current log# 1204 seq# 4 mem# 0: +DG_DATA/sdltsky/onlinelog/group_1204.260.1084444137
Successful open of redo thread 2
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Sep 28 11:19:34 2021
SMON: enabling cache recovery
[6881580] Successfully onlined Undo Tablespace 4.
Undo initialization finished serial:0 start:888712992 end:888713233 diff:241 (2 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
Tue Sep 28 11:19:35 2021
No Resource Manager plan active
Tue Sep 28 11:19:36 2021
minact-scn: Inst 2 is a slave inc#:4 mmon proc-id:8519750 status:0x2
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
Starting background process GTX0
Tue Sep 28 11:19:36 2021
WARNING: process GTX0 (ospid: 7405804) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
GTX0 started with pid=35, OS id=7405804
Starting background process RCBG
Tue Sep 28 11:19:36 2021
WARNING: process RCBG (ospid: 7602242) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
RCBG started with pid=37, OS id=7602242
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Tue Sep 28 11:19:37 2021
WARNING: process QMNC (ospid: 7012512) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
QMNC started with pid=39, OS id=7012512
Completed: ALTER DATABASE OPEN /* db agent *//* {2:37398:1620} */
Tue Sep 28 11:19:43 2021
Shutting down instance (abort)
License high water mark = 2
USER (ospid: 7340342): terminating the instance
WARNING: process USER (ospid: 7340342) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
Instance terminated by USER, pid = 7340342
Tue Sep 28 11:19:46 2021
Instance shutdown complete
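Analysis:
The alert log shows one warning repeated for nearly every process:
WARNING: process USER (ospid: 7340342) was unable to attach SMR.
SMR is corrupted. Shut down and restart the instance to recreate it.
A search on Metalink (My Oracle Support) suggests this may be caused by a bug (Doc ID 2732507.1):
after the October 2020 DB PSU is applied on AIX 7, the SMR file keeps getting corrupted and the instance cannot start normally.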
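To confirm how widespread the warning is, the alert log can be located and searched from the shell. This is a minimal sketch: `alert_log_path` and `count_smr_warnings` are hypothetical helpers, and the base directory, database name, and instance name are the ones that appear in the log above.

```shell
# Build the alert log path from ORACLE_BASE ($1), database name ($2)
# and instance name ($3).
alert_log_path() {
    printf '%s/diag/rdbms/%s/%s/trace/alert_%s.log\n' "$1" "$2" "$3" "$3"
}

# Count the "unable to attach SMR" warnings in log file $1.
count_smr_warnings() {
    grep -c 'was unable to attach SMR' "$1"
}

log=$(alert_log_path /u01/app/oracle sdltsky sdltsky2)
echo "$log"

# Only search when the file is readable on this host.
if [ -r "$log" ]; then
    echo "SMR warnings: $(count_smr_warnings "$log")"
fi
```

A count that grows with every startup attempt matches the behaviour described in the support note. A count of zero points to a different cause.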
Solution:
Install patch 32109594 to fix this problem.
The installation steps are as follows; see the patch README for details.
1. Apply Patch to Grid Home as GI home owner :
$ <GI_HOME>/OPatch/opatch apply -oh <GI_HOME> -local <PATCH_TOP_DIR>/32109594
2. Apply Patch to DB home(s) as DB home owner :
$ <ORACLE_HOME>/OPatch/opatch apply -oh <ORACLE_HOME> -local <PATCH_TOP_DIR>/32109594
3. Verify whether the patch has been successfully installed by running the following command:
As the Oracle Grid Infrastructure owner, run the following command:
$ opatch lsinventory -oh <GI_HOME>
As the Oracle Database home owner, run the following command:
$ opatch lsinventory -oh <ORACLE_HOME>
4. Run the post script as a root user from the Oracle home of Oracle Grid Infrastructure:
- Running the script for Grid Infrastructure home in Clustered Environments (Run as root)
# <GI_HOME>/crs/install/rootcrs.pl -patch
- Running the script for Grid Infrastructure home on Standalone Servers or Non-Clustered Environments (Run as root)
# <GI_HOME>/crs/install/roothas.pl -patch
Note: skip this step if you are applying this patch to a Database home.
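The verification in step 3 can be scripted by searching the inventory listing for the patch number. The sketch below assumes only that the number appears somewhere in the `opatch lsinventory` output. `patch_listed` is a hypothetical helper, and it makes no assumption about the rest of the output format.

```shell
# Hypothetical helper: read inventory text on stdin and succeed if patch
# number $1 appears in it as a whole word.
patch_listed() {
    grep -qw -- "$1"
}

# Intended use, once per home, as the owner of that home:
#   opatch lsinventory -oh "$ORACLE_HOME" | patch_listed 32109594 \
#       && echo "32109594 present" || echo "32109594 missing"
```

The whole-word match keeps a longer number that merely contains 32109594 from counting as a hit.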
3. Start the database
# Start the database with sqlplus
$ sqlplus / as sysdba
SQL> startup
ORACLE instance started.
Total System Global Area 2.6724E+10 bytes
Fixed Size 2258992 bytes
Variable Size 3154118608 bytes
Database Buffers 2.3555E+10 bytes
Redo Buffers 12107776 bytes
ORA-00205: error in identifying control file, check alert log for more info
SQL>
4. Check the database alert log
The main errors are as follows:
...
ALTER DATABASE MOUNT
This instance was first to mount
NOTE: Loaded library: System
ORA-15025: could not open disk "/dev/rhdiskpower3"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 13: Permission denied
Additional information: 11
SUCCESS: diskgroup DG_DATA was dismounted
ERROR: diskgroup DG_DATA was not mounted
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DG_DATA/sdltsky/controlfile/current.256.1084443507'
ORA-17503: ksfdopn:2 Failed to open file +DG_DATA/sdltsky/controlfile/current.256.1084443507
ORA-15001: diskgroup "DG_DATA" does not exist or is not mounted
ORA-15040: diskgroup is incomplete
ORA-205 signalled during: ALTER DATABASE MOUNT...
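Analysis:
After the patch was installed, the ownership and permissions of $ORACLE_HOME/bin/oracle changed, so the database could no longer read the ASM shared storage. +DG_DATA was left dismounted and the control file stored in it could not be opened, so the database could not start.
Solution:
1. Check the permissions of $ORACLE_HOME/bin/oracle: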
$ cd $ORACLE_HOME/bin
$ ls -ls oracle
309988 -rwsr-s--x 1 oracle oinstall 317346191 Sep 28 13:12 oracle
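2. The file should be owned by oracle:asmadmin
# chown oracle:asmadmin oracle
# The permission mode should be 6751 (-rwsr-s--x); chown may clear the setuid/setgid bits, so set the mode again afterwards
# chmod 6751 oracle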
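After the change, the owner, group, and mode can be checked in one pass. The following is a sketch only: `check_oracle_binary` is a hypothetical helper that expects a line in plain `ls -l` layout (mode, link count, owner, group, and so on), without the leading block count that `ls -ls` adds.

```shell
# Hypothetical helper: $1 is one `ls -l` line for $ORACLE_HOME/bin/oracle.
check_oracle_binary() {
    # Split the listing line into fields: mode, links, owner, group, ...
    set -- $1
    [ "$1" = "-rwsr-s--x" ] && [ "$3" = "oracle" ] && [ "$4" = "asmadmin" ]
}

# The listing from step 1, with the block count removed.
line='-rwsr-s--x 1 oracle oinstall 317346191 Sep 28 13:12 oracle'
if check_oracle_binary "$line"; then
    echo "oracle binary: owner, group and mode look right"
else
    echo "oracle binary: expected -rwsr-s--x oracle asmadmin"
fi

# On a live system:
#   check_oracle_binary "$(ls -l "$ORACLE_HOME/bin/oracle")"
```

With the listing from step 1 the check fails, because the group is still oinstall. Oracle Grid Infrastructure also ships a setasmgidwrap utility in the Grid home for resetting this group; confirm it exists in your installation before relying on it.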
5. Start the database again
$ srvctl start db -d <RACDB>
It starts without errors.
6. Check the cluster status
$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.DG_DATA.dg ora....up.type ONLINE ONLINE gt3-sky-db1
ora.DG_GRID.dg ora....up.type ONLINE ONLINE gt3-sky-db1
ora....ER.lsnr ora....er.type ONLINE ONLINE gt3-sky-db1
ora....N1.lsnr ora....er.type ONLINE ONLINE gt3-sky-db1
ora.asm ora.asm.type ONLINE ONLINE gt3-sky-db1
ora.cvu ora.cvu.type ONLINE ONLINE gt3-sky-db1
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....SM1.asm application ONLINE ONLINE gt3-sky-db1
ora....B1.lsnr application ONLINE ONLINE gt3-sky-db1
ora....db1.gsd application OFFLINE OFFLINE
ora....db1.ons application ONLINE ONLINE gt3-sky-db1
ora....db1.vip ora....t1.type ONLINE ONLINE gt3-sky-db1
ora....SM2.asm application ONLINE ONLINE gt3-sky-db2
ora....B2.lsnr application ONLINE ONLINE gt3-sky-db2
ora....db2.gsd application OFFLINE OFFLINE
ora....db2.ons application ONLINE ONLINE gt3-sky-db2
ora....db2.vip ora....t1.type ONLINE ONLINE gt3-sky-db2
ora....network ora....rk.type ONLINE ONLINE gt3-sky-db1
ora.oc4j ora.oc4j.type ONLINE ONLINE gt3-sky-db1
ora.ons ora.ons.type ONLINE ONLINE gt3-sky-db1
ora....ry.acfs ora....fs.type ONLINE ONLINE gt3-sky-db1
ora.scan1.vip ora....ip.type ONLINE ONLINE gt3-sky-db1
ora.sdltsky.db ora....se.type ONLINE ONLINE gt3-sky-db1
The cluster database status is normal; the problem is resolved.
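For a repeatable health check, the listing can be filtered for resources that should be online but are not. This is a minimal sketch assuming the five-column `crs_stat -t` layout shown above; `unhealthy_resources` is a hypothetical helper.

```shell
# Hypothetical helper: read `crs_stat -t` output on stdin and print the name
# of every resource whose Target is ONLINE but whose State is not.
unhealthy_resources() {
    awk '$3 == "ONLINE" && $4 != "ONLINE" { print $1 }'
}

# Two rows copied from the listing above; neither is reported, because the
# first is fully online and the second has an OFFLINE target.
unhealthy_resources <<'EOF'
ora.sdltsky.db ora....se.type ONLINE ONLINE gt3-sky-db1
ora.gsd ora.gsd.type OFFLINE OFFLINE
EOF

# On a live cluster:
#   crs_stat -t | unhealthy_resources
```

Resources with an OFFLINE target, such as ora.gsd, are skipped on purpose since they are not expected to run. Note that crs_stat is deprecated in 11.2 in favor of crsctl status resource -t, whose output layout differs, so this filter would need adapting for it.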