Here is what happened. I was recently setting up an Oracle 11.2.0.4 database of my own and applying the 11.2.0.4.4 GI PSU with this command:
opatch auto /soft/11.2.0.4.4/19380115/ -oh /soft/product/11.2.0.4/gih/,/soft/product/11.2.0.4/dbh -ocmrf /soft/11.2.0.4.4/ocm.rsp
While applying the GI PSU, however, I kept hitting the following error, with the patch applicability check failing:
Stopping CRS...
Stopped CRS successfully
Error : The opatch Applicable check failed. The patch /soft/11.2.0.4.4/19380115/19121549 is not applicable to /soft/product/11.2.0.4/gih
Error:Patch Applicable check failed for /soft/product/11.2.0.4/gih
Starting CRS...
ERROR: Prereq checkApplicable failed. Refer log file for more details.
opatch auto failed.
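No matter how I changed the command, this error followed me around like a shadow. As it turned out, the cause was something I had overlooked myself; more on that later.
The first time I hit the error I looked at the opatch log, opatchauto2014-11-20_23-10-24.log, and found the following in it: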
2014-11-19 00:49:44: Status of Applicable check for /soft/product/11.2.0.4/gih is 1
2014-11-19 00:49:44: Error:Patch Applicable check failed for /soft/product/11.2.0.4/gih
2014-11-19 00:49:44: Executing cmd: /bin/rpm -q sles-release
2014-11-19 00:49:45: Command output:
> package sles-release is not installed
>End Command output
2014-11-19 00:49:45: init file = /soft/product/11.2.0.4/gih/crs/init/init.ohasd
2014-11-19 00:49:45: Copying file /soft/product/11.2.0.4/gih/crs/init/init.ohasd to /etc/init.d directory
2014-11-19 00:49:45: Setting init.ohasd permission in /etc/init.d directory
2014-11-19 00:49:45: init file = /soft/product/11.2.0.4/gih/crs/init/ohasd
2014-11-19 00:49:45: Copying file /soft/product/11.2.0.4/gih/crs/init/ohasd to /etc/init.d directory
2014-11-19 00:49:45: Setting ohasd permission in /etc/init.d directory
2014-11-19 00:49:45: Executing cmd: /bin/rpm -q sles-release
2014-11-19 00:49:45: Command output:
> package sles-release is not installed
>End Command output
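So it says my system is missing the sles-release package. But that is SuSE's release package, and my system is RedHat. True, on RHEL 6.5 the redhat-release package has been replaced by one named redhat-release-server-6Server, but what does that have to do with sles-release?
The other question: is the GI PSU failure actually related to this error? I did not know yet, but at that point it was the only error I could see in the log, so without thinking too hard about it I decided to install the package it was complaining about. The package was surprisingly hard to find; I finally tracked it down on MOS (ID 1621417.1). It is really just an empty rpm with no content (could I have built one myself?).
I had expected that installing this package would solve the problem. Ha. Of course the PSU still failed to apply; the only difference was that the missing sles-release message no longer appeared.
So where was the real problem?
As we know, when opatch applies a PSU it writes every step it performs to a report log, in this case opatchauto2014-11-20_23-10-24.report.log.
So let's run those commands by hand, one at a time, starting with the prereq (pre-patch) checks. Note: run the crs_home checks as the grid user and the rac_home checks as the oracle user.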
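Running a check as the right OS user can be sketched roughly as follows. This is only an illustration: the CheckApplicable invocation is the one taken from the report log in this post, while the helper names (prereq_cmd, run_prereq_as) and the DRY_RUN switch are hypothetical and not part of opatch.

```shell
#!/usr/bin/env bash
# Sketch only: prereq_cmd, run_prereq_as and DRY_RUN are illustrative names,
# not opatch features. The opatch command line itself comes from the report log.

# prereq_cmd PATCH_DIR ORACLE_HOME
# Prints the CheckApplicable command for one patch directory and one home.
prereq_cmd() {
    local patch_dir=$1 home=${2%/}   # drop one trailing slash from the home
    printf '%s/OPatch/opatch prereq CheckApplicable -ph %s -oh %s\n' \
        "$home" "$patch_dir" "$home"
}

# run_prereq_as OS_USER PATCH_DIR ORACLE_HOME
# With DRY_RUN=1 (the default here) only prints what would be run;
# with DRY_RUN=0 it switches to OS_USER via su, so it must be called as root.
run_prereq_as() {
    local os_user=$1 cmd
    cmd=$(prereq_cmd "$2" "$3")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "su - $os_user -c '$cmd'"
    else
        su - "$os_user" -c "$cmd"
    fi
}
```

For the homes in this post, that would mean something like run_prereq_as grid /soft/11.2.0.4.4/19380115/19121549 /soft/product/11.2.0.4/gih for the GI home, and the same call with the oracle user for the DB home. Keeping the dry run as the default makes it easy to compare the generated command with the one in the report log before actually executing anything.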
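It turned out that the following command failed, and this is exactly the point where opatch auto errored out:
/soft/product/11.2.0.4/gih/OPatch/opatch prereq CheckApplicable -ph /soft/11.2.0.4.4/19380115/19121549 -oh /soft/product/11.2.0.4/gih
So the problem is probably here. Let's look at the error output first: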
Patch 19121549:
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/bin/appvipcfg.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'appvipcfg.pl' to '/soft/product/11.2.0.4/gih/bin/appvipcfg.pl'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/bin/oclumon.bin" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'oclumon.bin' to '/soft/product/11.2.0.4/gih/bin/oclumon.bin'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/bin/ologgerd" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'ologgerd' to '/soft/product/11.2.0.4/gih/bin/ologgerd'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/bin/osysmond.bin" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'osysmond.bin' to '/soft/product/11.2.0.4/gih/bin/osysmond.bin'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/crs/demo/coldfailover/act_db.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'act_db.pl' to '/soft/product/11.2.0.4/gih/crs/demo/coldfailover/act_db.pl'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/crs/demo/coldfailover/act_listener.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'act_listener.pl' to '/soft/product/11.2.0.4/gih/crs/demo/coldfailover/act_listener.pl'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/crs/demo/coldfailover/act_resgroup.pl" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'act_resgroup.pl' to '/soft/product/11.2.0.4/gih/crs/demo/coldfailover/act_resgroup.pl'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/crs/demo/demoActionScript" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'demoActionScript' to '/soft/product/11.2.0.4/gih/crs/demo/demoActionScript'
Copy Action: Source File "/soft/11.2.0.4.4/19380115/19121549/files/crs/install/tfa_setup.sh" does not exists or is not readable
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'tfa_setup.sh' to '/soft/product/11.2.0.4/gih/crs/install/tfa_setup.sh'
Copy Action: Directory is not writeable: "/soft/product/11.2.0.4/gih"
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'libsrvm11.so' to '/soft/product/11.2.0.4/gih/oui/lib/linux/libsrvm11.so'
This looks like a permissions problem, and a quick test confirmed it: the grid user has no permission to read these files:
[grid@MHAD2-11g ~]$ cp /soft/11.2.0.4.4/19380115/19121549/files/bin/appvipcfg.pl /tmp
cp: cannot open `/soft/11.2.0.4.4/19380115/19121549/files/bin/appvipcfg.pl' for reading: Permission denied
[grid@MHAD2-11g ~]$ l /soft/11.2.0.4.4/19380115/19121549/files/bin/appvipcfg.pl
-rwxr-x--- 1 root root 9051 Oct 6 18:27 /soft/11.2.0.4.4/19380115/19121549/files/bin/appvipcfg.pl
[grid@MHAD2-11g ~]$
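What a rookie mistake. Because opatch auto has to be run as root, I had simply unzipped the PSU archive as root, so the extracted files naturally ended up owned by user and group root, as shown here: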
[root@MHAD2 11.2.0.4.4]# ls -l /soft/11.2.0.4.4/|grep -v zip
total 6108644
drwxr-xr-x 5 root root 4096 Oct 11 10:04 19380115
-rw-r--r-- 1 root root 621 Nov 20 23:09 ocm.rsp
-rw-rw-r-- 1 root root 2186 Oct 15 02:54 PatchSearch.xml
[root@MHAD2 11.2.0.4.4]#
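Although opatch auto must be run as root, it still uses su from root to the grid and oracle users to perform the relevant checks, so permission problems are bound to show up.
So I changed the ownership of the 19380115 directory to grid:oinstall. Alternatively, you can delete it and unzip the archive again as the grid user.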
[root@MHAD2 11.2.0.4.4]# chown -R grid:oinstall 19380115
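To confirm that nothing under the patch directory is still owned by someone else, a small scan like the one below can help. It is a minimal sketch: list_foreign_owned is a hypothetical helper name, and it only looks at ownership, so files with restrictive mode bits would still need a separate check.

```shell
#!/usr/bin/env bash
# Sketch only: list_foreign_owned is an illustrative helper name.

# list_foreign_owned DIR OWNER
# Prints every path under DIR (including DIR itself) that is NOT owned by
# OWNER. OWNER may be a user name or a numeric user ID.
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}
```

After the chown above, list_foreign_owned /soft/11.2.0.4.4/19380115 grid should print nothing. Any path it does print is a candidate for the same "does not exists or is not readable" failure.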
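Then I happily re-ran the opatch auto command. Surely it would work this time.
But I was too naive. opatch played another trick on me and failed again. Fine; with the earlier experience in hand, this time I went straight to running the commands by hand.
Once again, this was the command that failed:
/soft/product/11.2.0.4/gih/OPatch/opatch prereq CheckApplicable -ph /soft/11.2.0.4.4/19380115/19121549 -oh /soft/product/11.2.0.4/gih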
Patch 19121549:
Copy Action: Destination File "/soft/product/11.2.0.4/gih/crs/install/tfa_setup.sh" is not writeable.
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'tfa_setup.sh' to '/soft/product/11.2.0.4/gih/crs/install/tfa_setup.sh'
Copy Action: Directory is not writeable: "/soft/product/11.2.0.4/gih"
'oracle.crs, 11.2.0.4.0': Cannot copy file from 'libsrvm11.so' to '/soft/product/11.2.0.4/gih/oui/lib/linux/libsrvm11.so'
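And it was a permissions problem again. I could have cried.
The first file, tfa_setup.sh, is not writable because it was created when opatch auto ran as root and is therefore owned by root. When opatch later runs as the grid user and needs to write to that file, it has no permission. Either delete the file as root or change its ownership to grid:oinstall.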
[root@MHAD2 11.2.0.4]# chown grid:oinstall /soft/product/11.2.0.4/gih/crs/install/tfa_setup.sh
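So what about the second error? The grid user cannot write to the GI HOME?
It turns out that after GI is installed, running the root.sh script changes the owner of the GI HOME to root: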
[root@MHAD2 11.2.0.4]# ls -ltr
total 8
drwxr-xr-x 76 oracle oinstall 4096 Nov 20 23:11 dbh
drwxr-x--- 68 root oinstall 4096 Nov 20 23:14 gih
Ha, so this was my second rookie mistake. For now, let's temporarily change the owner of the directory to grid:
[root@MHAD2 11.2.0.4]# chown grid gih/
[root@MHAD2 11.2.0.4]# ls -ltr
total 8
drwxr-xr-x 76 oracle oinstall 4096 Nov 20 23:11 dbh
drwxr-x--- 68 grid oinstall 4096 Nov 20 23:14 gih
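Since this ownership change is meant to be temporary, it may be worth recording and checking the owner of a home before and after patching. The following is a rough sketch under the assumption of GNU stat (as on RHEL); owner_of and check_home_owner are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: owner_of and check_home_owner are illustrative helper names.
# Assumes GNU coreutils stat, which supports -c with the %U format.

# owner_of PATH
# Prints the name of the user that owns PATH.
owner_of() {
    stat -c '%U' "$1"
}

# check_home_owner HOME EXPECTED_USER
# Returns 0 if the top-level directory HOME is owned by EXPECTED_USER,
# 1 (with a warning on stderr) if it is owned by someone else,
# 2 if the owner could not be determined.
check_home_owner() {
    local actual
    actual=$(owner_of "$1") || return 2
    if [ "$actual" != "$2" ]; then
        echo "WARNING: $1 is owned by $actual, expected $2" >&2
        return 1
    fi
}
```

For example, saving orig_owner=$(owner_of /soft/product/11.2.0.4/gih) before the chown makes it possible to put the original owner back once patching is done, and check_home_owner /soft/product/11.2.0.4/gih grid gives a quick yes or no before re-running opatch.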
This time I finally saw the long-awaited success messages:
Stopping CRS...
Stopped CRS successfully
patch /soft/11.2.0.4.4/19380115/19121551 apply successful for home /soft/product/11.2.0.4/gih
patch /soft/11.2.0.4.4/19380115/19121549 apply successful for home /soft/product/11.2.0.4/gih
patch /soft/11.2.0.4.4/19380115/19121552 apply successful for home /soft/product/11.2.0.4/gih
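It was only a partial success, though. I had installed a standalone database, not RAC, so the GI HOME and the DB HOME cannot be specified together the way I did: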
Stopping RAC /soft/product/11.2.0.4/dbh ...
Failed to stop resources from database home /soft/product/11.2.0.4/dbh
ERROR: Refer log file for more details.
The correct form is:
opatch auto /soft/11.2.0.4.4/19380115/ -och /soft/product/11.2.0.4/gih/ -oh /soft/product/11.2.0.4/dbh -ocmrf /soft/11.2.0.4.4/ocm.rsp
So only the GI PSU was applied successfully. All that remains is to start HAS manually and then apply the PSU to the DB HOME on its own:
[root@MHAD2 11.2.0.4.4]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[root@MHAD2 11.2.0.4.4]# opatch auto /soft/11.2.0.4.4/19380115/ -oh /soft/product/11.2.0.4/dbh -ocmrf /soft/11.2.0.4.4/ocm.rsp
Stopping RAC /soft/product/11.2.0.4/dbh ...
Stopped RAC /soft/product/11.2.0.4/dbh successfully
patch /soft/11.2.0.4.4/19380115/19121551 apply successful for home /soft/product/11.2.0.4/dbh
patch /soft/11.2.0.4.4/19380115/19121549/custom/server/19121549 apply successful for home /soft/product/11.2.0.4/dbh
Starting RAC /soft/product/11.2.0.4/dbh ...
Started RAC /soft/product/11.2.0.4/dbh successfully
With that, the problem is solved. To sum up: permissions, permissions, and permissions again!