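Over the past two days I installed a two-node RAC for a production database. The environment already had RAC on it, so this was a reinstall. I was told the previous database had serious problems, but I did not dig into them, and since the job was urgent I went straight to reinstalling.
That was a sign that the installation would run into plenty of problems.
OS version: RHEL 7.1; Oracle version: 11.2.0.4
Removing the previous environment went without any problems.
Then came the installation.
1.Disk binding and cleanup
Disks can be bound in several ways, for example raw, udev, or Oracle's official asmlib.
When choosing disks during the grid installation: the disks use multipath for redundancy, but /dev/mapper/data* is owned by root, so Oracle cannot see the devices.
So I bound them as raw devices here, for example: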
/bin/raw /dev/raw/raw113 /dev/mapper/data113
chown grid:asmadmin /dev/raw/raw113
Then add /bin/raw /dev/raw/raw113 /dev/mapper/data113 to /etc/rc.local so that the binding is re-created at boot.
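With many disks, typing the binding lines by hand is error-prone. The following is a minimal sketch that only prints the commands for a range of devices so they can be reviewed before being appended to /etc/rc.local. It assumes the data<N> to raw<N> naming used above; gen_raw_bindings is a hypothetical helper name, and the range 113 to 114 is only an example. It also prints the chown line, on the assumption that ownership of the /dev/raw nodes may need to be reapplied after a reboot.

```shell
#!/bin/bash
# Print (do not execute) the raw-binding commands for devices first..last.
# Assumes /dev/mapper/data<N> maps to /dev/raw/raw<N>, as in the example above.
gen_raw_bindings() {
    local first=$1 last=$2 n
    for n in $(seq "$first" "$last"); do
        echo "/bin/raw /dev/raw/raw${n} /dev/mapper/data${n}"
        echo "chown grid:asmadmin /dev/raw/raw${n}"
    done
}

# Review the output, then append it to /etc/rc.local as root.
gen_raw_bindings 113 114
```

Printing rather than executing keeps the sketch safe to run as any user; the actual binding still has to be done as root.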
Because the disks had been used before, their header status is not CANDIDATE:
SQL> select group_number,mount_status,header_status,total_mb,path from v$asm_disk;
GROUP_NUMBER MOUNT_S HEADER_STATU TOTAL_MB PATH
------------ ------- ------------ ---------- ----------------------------------------
0 CLOSED MEMBER 0 /dev/raw/raw102
0 CLOSED FORMER 0 /dev/raw/raw121
0 CLOSED FORMER 0 /dev/raw/raw120
0 CLOSED FORMER 0 /dev/raw/raw118
0 CLOSED FORMER 0 /dev/raw/raw119
0 CLOSED MEMBER 0 /dev/raw/raw103
0 CLOSED MEMBER 0 /dev/raw/raw104
...
Disks in this state cannot be used directly during installation; they have to be cleaned first:
dd if=/dev/zero of=/dev/mapper/data101 bs=1k count=1
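bs*count must be >= the size of the disk header; the value of bs directly affects how fast dd runs.
Once cleaned, the disk can be selected during installation.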
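The bs*count rule is plain ceiling division, which can be sketched as below. dd_count_for is a hypothetical helper, and the 1 MiB figure in the example is an arbitrary illustration, not a documented header size. The script only prints the dd command; it does not run it.

```shell
#!/bin/bash
# Smallest count such that bs_bytes * count >= bytes_to_wipe (ceiling division).
dd_count_for() {
    local bytes_to_wipe=$1 bs_bytes=$2
    echo $(( (bytes_to_wipe + bs_bytes - 1) / bs_bytes ))
}

# Illustration only: wipe an assumed 1 MiB using 1 KiB blocks.
echo "dd if=/dev/zero of=/dev/mapper/data101 bs=1k count=$(dd_count_for 1048576 1024)"
```

A larger bs with a smaller count wipes the same number of bytes in fewer writes, which is why bs affects the speed.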
2.gpgkey
During the installation I found that elfutils-libelf-devel was not installed, and installing it with yum failed.
Check yum:
[root@xsdbd31 ~]# yum repolist
...
repolist: 17,341
Look at the yum configuration file:
[root@xsdbd31 yum.repos.d]# pwd
/etc/yum.repos.d
[root@xsdbd31 yum.repos.d]# cat redhat7_1.repo
[beta] #
name=redhat-$releasever - beta
baseurl=http://10.174.70.29/redhat/redhat7.1
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-beta
priority=1
The gpgkey points to the beta key, not the release key.
rpm --import file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
Then install the package again.
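A quick way to spot this kind of mismatch is to list the gpgkey lines in a repo file and flag any that point at a beta key. The sketch below assumes the key file name contains "-beta", as it does in the repo file above; check_repo_gpgkey is a hypothetical helper, and the demo runs against a temporary copy rather than the real /etc/yum.repos.d file.

```shell
#!/bin/bash
# Print each gpgkey line of a repo file, prefixed with BETA or OK.
check_repo_gpgkey() {
    local repo_file=$1
    grep -E '^gpgkey=' "$repo_file" | while IFS= read -r line; do
        case "$line" in
            *-beta*) echo "BETA: $line" ;;
            *)       echo "OK: $line" ;;
        esac
    done
}

# Demo on a scratch file shaped like redhat7_1.repo.
demo=$(mktemp)
printf '%s\n' '[beta]' 'gpgcheck=1' \
    'gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-beta' > "$demo"
check_repo_gpgkey "$demo"
rm -f "$demo"
```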
3.ACFS-9459, ACFS-9201
Running root.sh during the grid installation produced the following:
[client(27240)]CRS-10001:19-Jan-18 03:47 ACFS-9459: ADVM/ACFS is not supported on this OS version: 'unknown'
[client(27242)]CRS-10001:19-Jan-18 03:47 ACFS-9201: Not Supported
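The error is clear: this database version is not certified on the current OS. According to the MOS note I checked, it can be ignored.
I did ignore it, but I ran into quite a few problems later on.
4.TMP environment variable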
[grid@xsdbd31 bin]$ ./cluvfy stage -pre crsinst -n xsdbd31,xsdbd32 -r 11gR2 -verbose
ERROR:
Work area path "tmp/" is invalid. It must be specified as an absolute pathname
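At first I thought /tmp had run out of space, but the error actually says that the path must be absolute.
Check the variables in the grid user's profile: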
export TMP=tmp;
export TMPDIR=$TMP;
Change them to:
export TMP=/tmp;
export TMPDIR=$TMP;
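The same check that cluvfy complained about can be done up front. This is a minimal sketch; is_absolute_path and check_tmp_var are hypothetical helper names, and it only tests whether the value starts with "/", not whether the directory exists or is writable.

```shell
#!/bin/bash
# Succeeds only when the argument starts with "/".
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# $1 = variable name (for the message), $2 = its current value.
check_tmp_var() {
    if is_absolute_path "$2"; then
        echo "$1=$2 ok"
    else
        echo "$1=$2 is not an absolute path"
    fi
}

# Run as the grid user after sourcing its profile.
check_tmp_var TMP "${TMP:-}"
check_tmp_var TMPDIR "${TMPDIR:-}"
```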
5.Non-root user with UID 0
Both root.sh and the verification script reported the following:
ERROR:
Unable to obtain network interface list from Oracle ClusterwarePRCT-1011 : Failed to run "oifcfg". Detailed error: null
Verification cannot proceed
At first I suspected the network, but the network checked out and the firewall was off.
I searched MOS:
Errors PRCR-1006 PRCR-1071 PROC-5 reported during GI installation as multiple user has uid 0 (Doc ID 2012626.1)
A non-root user was using UID 0, that is, root's UID:
more /etc/passwd
root:x:0:0:root:/root:/bin/bash
ROOT:x:0:0::/ROOT:/bin/bash
There is a ROOT account here that is not the OS root user. After checking with the host administrators, I changed ROOT's UID:
root:x:0:0:root:/root:/bin/bash
ROOT:x:1212:0::/ROOT:/bin/bash
Running root.sh again no longer produced this error.
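Reading /etc/passwd by eye works, but a one-line awk filter makes the duplicate stand out. The sketch below lists every account whose UID field is 0; list_uid0_accounts is a hypothetical helper name, and anything other than root in its output is worth questioning before running root.sh.

```shell
#!/bin/bash
# Print the name of every account whose UID (third field) is 0.
list_uid0_accounts() {
    local passwd_file=${1:-/etc/passwd}
    awk -F: '$3 == 0 { print $1 }' "$passwd_file"
}

# On a healthy host this normally prints only "root".
list_uid0_accounts
```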
6.ohasd fails to start
Then rebuild the cluster:
/grid/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force -verbose
[root@xsdbd31 bin]# /grid/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /grid/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow:
2018-01-19 10:27:46.599:
[client(55431)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2018-01-19 10:27:46.600:
[client(55431)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /grid/app/11.2.0/grid/log/xsdbd31/client/crsctl_grid.log.
2018-01-19 10:27:53.478:
[client(55634)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2018-01-19 10:27:53.479:
[client(55634)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /grid/app/11.2.0/grid/log/xsdbd31/client/crsctl_grid.log.
2018-01-19 10:30:52.066:
[client(57088)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2018-01-19 10:30:52.067:
[client(57088)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /grid/app/11.2.0/grid/log/xsdbd31/client/crsctl_grid.log.
2018-01-19 10:30:58.982:
[client(57315)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2018-01-19 10:30:58.983:
[client(57315)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /grid/app/11.2.0/grid/log/xsdbd31/client/crsctl_grid.log.
[client(96625)]CRS-10001:19-Jan-18 11:57 ACFS-9459: ADVM/ACFS is not supported on this OS version: 'unknown'
[client(96627)]CRS-10001:19-Jan-18 11:57 ACFS-9201: Not Supported
2018-01-19 12:04:51.360:
[client(101408)]CRS-2101:The OLR was formatted using version 3.
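But it still failed!
I found this reference: http://blog.csdn.net/u010692693/article/details/48374557
"Because RHEL 7 uses systemd rather than initd to run and restart processes, while root.sh starts the ohasd process through the traditional initd."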
touch /usr/lib/systemd/system/ohas.service
chmod 777 /usr/lib/systemd/system/ohas.service
cat /usr/lib/systemd/system/ohas.service
[Unit]
Description=Oracle High Availability Services
After=syslog.target
[Service]
ExecStart=/etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
Restart=always
[Install]
WantedBy=multi-user.target
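One detail of the unit file above is worth noting: systemd does not run ExecStart through a shell, so the redirection tokens and the trailing Type=simple on that line are passed to init.ohasd as ordinary arguments rather than being interpreted (the systemctl status output reflects this). The unit evidently worked as written. The sketch below is a tidier variant that I have not tested in this environment; it puts Type=simple on its own line and drops the redirection. UNIT_DIR is a hypothetical variable that defaults to a scratch directory so the file can be reviewed before being copied to /usr/lib/systemd/system as root.

```shell
#!/bin/bash
# Write a cleaned-up ohas.service to a scratch directory for review.
UNIT_DIR=${UNIT_DIR:-$(mktemp -d)}
cat > "$UNIT_DIR/ohas.service" <<'EOF'
[Unit]
Description=Oracle High Availability Services
After=syslog.target

[Service]
Type=simple
ExecStart=/etc/init.d/init.ohasd run
Restart=always

[Install]
WantedBy=multi-user.target
EOF
# A unit file only needs to be readable; 644 is enough, 777 is not required.
chmod 644 "$UNIT_DIR/ohas.service"
echo "wrote $UNIT_DIR/ohas.service"
```

If the output really should be discarded, StandardOutput=null in the [Service] section is the systemd way to express it.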
[root@xsdbd31 bin]# systemctl daemon-reload
[root@xsdbd31 bin]# systemctl enable ohas.service
ln -s '/usr/lib/systemd/system/ohas.service' '/etc/systemd/system/multi-user.target.wants/ohas.service'
[root@xsdbd31 bin]# systemctl start ohas.service
[root@xsdbd31 bin]# systemctl status ohas.service
ohas.service - Oracle High Availability Services
Loaded: loaded (/usr/lib/systemd/system/ohas.service; enabled)
Active: active (running) since Fri 2018-01-19 12:58:04 CST; 8s ago
Main PID: 125248 (init.ohasd)
CGroup: /system.slice/ohas.service
└─125248 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
Jan 19 12:58:04 xsdbd31 systemd[1]: Started Oracle High Availability Services
Run root.sh again:
/grid/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -force -verbose
[root@xsdbd31 bin]# /grid/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /grid/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'xsdbd31'
CRS-2676: Start of 'ora.mdnsd' on 'xsdbd31' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'xsdbd31'
CRS-2676: Start of 'ora.gpnpd' on 'xsdbd31' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'xsdbd31'
CRS-2672: Attempting to start 'ora.gipcd' on 'xsdbd31'
CRS-2676: Start of 'ora.cssdmonitor' on 'xsdbd31' succeeded
CRS-2676: Start of 'ora.gipcd' on 'xsdbd31' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'xsdbd31'
CRS-2672: Attempting to start 'ora.diskmon' on 'xsdbd31'
CRS-2676: Start of 'ora.diskmon' on 'xsdbd31' succeeded
CRS-2676: Start of 'ora.cssd' on 'xsdbd31' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'xsdbd31'
CRS-2676: Start of 'ora.asm' on 'xsdbd31' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'xsdbd31'
CRS-2676: Start of 'ora.OCR.dg' on 'xsdbd31' succeeded
Preparing packages...
cvuqdisk-1.0.9-1.x86_64
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
One node finally succeeded.
Now for node 2:
[root@xsdbd32 xsdbd32]# systemctl daemon-reload
[root@xsdbd32 xsdbd32]# systemctl start ohas.service
[root@xsdbd32 xsdbd32]# systemctl status ohas.service
ohas.service - Oracle High Availability Services
Loaded: loaded (/usr/lib/systemd/system/ohas.service; disabled)
Active: failed (Result: start-limit) since Fri 2018-01-19 13:40:39 CST; 3s ago
Process: 130289 ExecStart=/etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple (code=exited, status=203/EXEC)
Main PID: 130289 (code=exited, status=203/EXEC)
Jan 19 13:40:39 xsdbd32 systemd[1]: Unit ohas.service entered failed state.
Jan 19 13:40:39 xsdbd32 systemd[1]: ohas.service holdoff time over, scheduling restart.
Jan 19 13:40:39 xsdbd32 systemd[1]: Stopping Oracle High Availability Services...
Jan 19 13:40:39 xsdbd32 systemd[1]: Starting Oracle High Availability Services...
Jan 19 13:40:39 xsdbd32 systemd[1]: ohas.service start request repeated too quickly, refusing to start.
Jan 19 13:40:39 xsdbd32 systemd[1]: Failed to start Oracle High Availability Services.
Jan 19 13:40:39 xsdbd32 systemd[1]: Unit ohas.service entered failed state.
Node 2 failed, and so did root.sh.
The fix again comes from the URL above: as soon as init.ohasd has been generated, immediately run systemctl start ohas.service.
[root@xsdbd32 xsdbd32]# ls /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
Window 1: run root.sh
Window 2: sure enough, init.ohasd eventually appeared
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@xsdbd32 ~]# ls /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@xsdbd32 ~]# ls /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@xsdbd32 ~]# ls /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@xsdbd32 ~]# ls /etc/init.d/init.ohasd
ls: cannot access /etc/init.d/init.ohasd: No such file or directory
[root@xsdbd32 ~]# ls /etc/init.d/init.ohasd
/etc/init.d/init.ohasd
[root@xsdbd32 ~]# systemctl start ohas.service
[root@xsdbd32 ~]# systemctl status ohas.service
ohas.service - Oracle High Availability Services
Loaded: loaded (/usr/lib/systemd/system/ohas.service; disabled)
Active: active (running) since Fri 2018-01-19 13:49:34 CST; 12s ago
Main PID: 136574 (init.ohasd)
CGroup: /system.slice/ohas.service
└─136574 /bin/sh /etc/init.d/init.ohasd run >/dev/null 2>&1 Type=simple
Jan 19 13:49:34 xsdbd32 systemd[1]: Starting Oracle High Availability Services...
Jan 19 13:49:34 xsdbd32 systemd[1]: Started Oracle High Availability Services.
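Re-running ls by hand until the file shows up can be replaced by a small polling loop. The sketch below is one way to do it, assuming a one-second poll is fast enough; wait_for_file is a hypothetical helper, and the demo uses a scratch file instead of /etc/init.d/init.ohasd so it can be run safely anywhere.

```shell
#!/bin/bash
# Poll once per second until path exists; give up after timeout seconds.
wait_for_file() {
    local path=$1 timeout=${2:-600} waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Demo: a background job creates the file after one second.
demo_dir=$(mktemp -d)
( sleep 1; touch "$demo_dir/init.ohasd" ) &
if wait_for_file "$demo_dir/init.ohasd" 10; then
    echo "file appeared; this is where systemctl start ohas.service would run"
fi
wait
rm -rf "$demo_dir"
```

In window 2 the real call would look like `wait_for_file /etc/init.d/init.ohasd 600 && systemctl start ohas.service`, run as root while root.sh is running in window 1.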
Window 1:
[root@xsdbd32 xsdbd32]# /grid/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /grid/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to inittab
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node xsdbd31, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Preparing packages...
cvuqdisk-1.0.9-1.x86_64
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Delighted!
[root@xsdbd32 xsdbd32]# /grid/app/11.2.0/grid/bin/crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....N1.lsnr ora....er.type ONLINE ONLINE xsdbd31
ora.OCR.dg ora....up.type ONLINE ONLINE xsdbd31
ora.asm ora.asm.type ONLINE ONLINE xsdbd31
ora.cvu ora.cvu.type ONLINE ONLINE xsdbd31
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....network ora....rk.type ONLINE ONLINE xsdbd31
ora.oc4j ora.oc4j.type ONLINE ONLINE xsdbd31
ora.ons ora.ons.type ONLINE ONLINE xsdbd31
ora.scan1.vip ora....ip.type ONLINE ONLINE xsdbd31
ora....SM1.asm application ONLINE ONLINE xsdbd31
ora....d31.gsd application OFFLINE OFFLINE
ora....d31.ons application ONLINE ONLINE xsdbd31
ora....d31.vip ora....t1.type ONLINE ONLINE xsdbd31
ora....SM2.asm application ONLINE ONLINE xsdbd32
ora....d32.gsd application OFFLINE OFFLINE
ora....d32.ons application ONLINE ONLINE xsdbd32
ora....d32.vip ora....t1.type ONLINE ONLINE xsdbd32
7.ins_emagent.mk error while installing the db
Installing the db software:
http://blog.csdn.net/halley333/article/details/53941859
The installation reported an error:
Error in $ORACLE_HOME/sysman/lib/ins_emagent.mk
Fix: edit the file named in the error, replacing
$(MK_EMAGENT_NMECTL)
with
$(MK_EMAGENT_NMECTL) -lnnz11
Then click Retry and the installation continues.
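The edit can also be scripted. The following is a minimal sketch, assuming the makefile contains the literal text $(MK_EMAGENT_NMECTL) as shown above; patch_emagent_mk is a hypothetical helper, it keeps a .bak copy, and it skips the file if the flag is already present. The demo runs on a scratch file, not on the real ins_emagent.mk.

```shell
#!/bin/bash
# Append -lnnz11 after $(MK_EMAGENT_NMECTL); safe to run more than once.
patch_emagent_mk() {
    local mk_file=$1
    # Already patched: nothing to do.
    if grep -qF 'MK_EMAGENT_NMECTL) -lnnz11' "$mk_file"; then
        return 0
    fi
    cp -p "$mk_file" "$mk_file.bak"
    # "&" in the replacement stands for the matched text.
    sed 's/\$(MK_EMAGENT_NMECTL)/& -lnnz11/' "$mk_file.bak" > "$mk_file"
}

# Demo on a scratch file containing the line in question.
demo=$(mktemp)
printf '\t$(MK_EMAGENT_NMECTL)\n' > "$demo"
patch_emagent_mk "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

On the real system the call would be `patch_emagent_mk "$ORACLE_HOME/sysman/lib/ins_emagent.mk"`, run as the software owner before clicking Retry.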
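I have installed RAC many times, but this is the first time I have hit so many errors; it took a lot of time and effort to get through them all.
In future I need to check the environment carefully, and always check the certification status.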