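I. Rebuilding the operating system
Disk partitioning:
Reserve a 20 GB unallocated partition (normally this is only needed on one machine; the free partition serves as the shared device).
/ 30G
/boot 100M
/usr 10G
swap 5G
When selecting packages, make sure the development tools are checked.
Disable the following services:
Firewall (iptables)
selinux
sendmail
syslog
Hostnames and IPs:
If a machine has two NICs, configure an IP on each.
If it has only one NIC, copy ifcfg-eth0 to ifcfg-eth0:2 and configure the second IP there.
Node 1--
Hostname: stu10
eth0 ip: 192.168.1.10
eth1 ip: 10.0.0.10
Node 2--
Hostname: stu12
eth0 ip: 192.168.1.12
eth1 ip: 10.0.0.12
Add the following to /etc/hosts on both hosts: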
192.168.1.10 stu10
10.0.0.10 stu10-priv
192.168.1.11 stu10-vip
192.168.1.12 stu12
10.0.0.12 stu12-priv
192.168.1.13 stu12-vip
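stu10/stu12: public IP, the address on which the host serves external clients (used to reach the host).
stu10-priv/stu12-priv: private IP, connected over fiber on the internal network and used for the heartbeat. (This is the cluster's coordination channel: the nodes linked by the heartbeat form the cluster, and the network is private. Data transferred between nodes also travels over the private IP.)
stu10-vip/stu12-vip: virtual IP. When a node crashes, its VIP floats to another node, which then provides the services the failed node offered (used to reach the database).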
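Each node needs three names (public, -priv, -vip), so a missing entry is easy to overlook. The following is a minimal sketch of a check, assuming the naming scheme above; check_hosts_entries is a hypothetical helper, not part of any Oracle tooling, and it only reads the file it is given.

```shell
#!/bin/sh
# Sketch: report which expected RAC names are missing from a hosts file.
# check_hosts_entries is a hypothetical helper; pass /etc/hosts in practice.
check_hosts_entries() {
    hosts_file=$1
    shift
    missing=""
    for node in "$@"; do
        for name in "$node" "$node-priv" "$node-vip"; do
            # Match the name as a whole field after the IP column, ignoring comments.
            if ! awk -v n="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$hosts_file"; then
                missing="$missing $name"
            fi
        done
    done
    # Empty output means every expected name was found.
    echo "${missing# }"
}
```

A call such as `check_hosts_entries /etc/hosts stu10 stu12` prints the names that still need to be added, or an empty line if nothing is missing.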
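II. iSCSI shared disk configuration:
1. Package installation:
Server side (packages are in the ClusterStorage directory):
rpm -ivh perl-Config-General-2.40-1.el5.noarch.rpm   # dependency package
rpm -ivh scsi-target-utils-0.0-5.20080917snap.el5.x86_64.rpm   # server-side (target) program
rpm -ivh iscsi-initiator-utils-6.2.0.871-0.10.el5.x86_64.rpm   # also install the client program on the server to make troubleshooting easier
* Never install the iSCSI server-side RPM on the client!
Client side (package is in the Server directory):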
rpm -ivh iscsi-initiator-utils-6.2.0.871-0.10.el5.x86_64.rpm
Verify that the packages installed successfully:
[root@top-33 opt]# rpm -qa | grep scsi
scsi-target-utils-0.0-5.20080917snap.el5
iscsi-initiator-utils-6.2.0.871-0.10.el5
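2. Create the disk device:
If no space was reserved when the OS was installed, the only option is to create a backing file with dd:
dd if=/dev/zero of=/iscsi/disk1 bs=1M count=4096
If free space is available, carve out a 10 GB partition directly: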
fdisk /dev/hdb
-->n
-->
-->+10G
--> w
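As a sanity check on the dd sizing, bs=1M with count=4096 should come to exactly the 4294967296 bytes that fdisk reports for the shared disk further down. A minimal sketch of the arithmetic; dd_total_bytes is an illustrative helper, not a standard command.

```shell
#!/bin/sh
# Sketch: size in bytes of a file written by dd with bs=<N>M count=<C>.
dd_total_bytes() {
    bs_mib=$1
    count=$2
    echo $((bs_mib * 1024 * 1024 * count))
}

dd_total_bytes 1 4096   # prints 4294967296
```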
3. Server-side iSCSI disk configuration: 192.168.1.12
Edit the iSCSI server configuration file:
vi /etc/tgt/targets.conf
----------------------------------------
backing-store /iscsi/disk1
initiator-address 192.168.1.0/24
----------------------------------------
vim /etc/udev/scripts/iscsidev.sh
----------------------------------------
#!/bin/bash
BUS=${1}
HOST=${BUS%%:*}
[ -e /sys/class/iscsi_host ] || exit 1
file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"
target_name=$(cat ${file})
if [ -z "${target_name}" ] ; then
exit 1
fi
echo "${target_name##*:}"
----------------------------------------
chmod +x /etc/udev/scripts/iscsidev.sh
vi /etc/udev/rules.d/55-openiscsi.rules   # the contents of this file must be identical on the server and the client
-----------------------------------------------
KERNEL=="sd*",BUS=="scsi",PROGRAM="/etc/udev/scripts/iscsidev.sh %b",SYMLINK+="iscsi/%c"
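The two parameter expansions in iscsidev.sh do the real work: ${BUS%%:*} keeps the part of the %b value before the first colon (the SCSI host number), and ${target_name##*:} keeps the part of the target name after the last colon, which the rule then uses as the symlink name under /dev/iscsi. A small sketch on sample inputs; the bus id 3:0:0:1 and the IQN below are made-up values, and the function names are illustrative.

```shell
#!/bin/sh
# Sketch of the string handling used in iscsidev.sh, on sample inputs only.

# Keep everything before the first ':' of a bus id such as "3:0:0:1".
bus_to_host() {
    bus=$1
    echo "${bus%%:*}"
}

# Keep everything after the last ':' of an iSCSI target name.
target_short_name() {
    target=$1
    echo "${target##*:}"
}

bus_to_host "3:0:0:1"                               # prints 3
target_short_name "iqn.2011-01.com.example:disk1"   # prints disk1
```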
Check the boot-time startup status of the iSCSI-related services:
[root@top-33 opt]# chkconfig --list iscsi
iscsi 0:off 1:off 2:off 3:on 4:on 5:on 6:off
[root@top-33 opt]# chkconfig --list iscsid
iscsid 0:off 1:off 2:off 3:on 4:on 5:on 6:off
[root@top-33 opt]# chkconfig --list tgtd
tgtd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@top-33 opt]# chkconfig tgtd on
[root@top-33 opt]# chkconfig --list tgtd
tgtd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Check the current running status of the iSCSI-related services:
[root@top-33 opt]# service iscsi status
iscsid is stopped
[root@top-33 opt]# service iscsid status
iscsid is stopped
[root@top-33 opt]# service tgtd status
tgtd is stopped
Start the iSCSI daemon and related services:
service iscsid start
service tgtd start
[root@top-33 opt]# service iscsi status
iscsid (pid 3108) is running...
[root@top-33 opt]# service iscsid status
iscsid (pid 3108) is running...
[root@top-33 opt]# service tgtd status
tgtd (pid 3144 3143) is running...
On the server, test locally that the shared disk can be discovered:
iscsiadm -m discovery -t sendtargets -p 192.168.1.33
Start the iscsi service:
service iscsi start
Log in to the shared targets to gain access to the disk:
iscsiadm -m node -L all
Use fdisk -l to check that the iSCSI shared disk is present:
[root@top-33 opt]# fdisk -l
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 33 265041 83 Linux
/dev/sda2 34 5255 41945715 83 Linux
/dev/sda3 5256 7866 20972857+ 83 Linux
/dev/sda4 7867 30401 181012387+ 5 Extended
/dev/sda5 7867 8388 4192933+ 82 Linux swap / Solaris
/dev/sda6 8389 8649 2096451 83 Linux
Disk /dev/sdb: 4294 MB, 4294967296 bytes
133 heads, 62 sectors/track, 1017 cylinders
Units = cylinders of 8246 * 512 = 4221952 bytes
Disk /dev/sdb doesn't contain a valid partition table
4. Client-side iSCSI disk configuration: 192.168.1.10
vi /etc/udev/rules.d/55-openiscsi.rules
------------------------------------------------
KERNEL=="sd*",BUS=="scsi",PROGRAM="/etc/udev/scripts/iscsidev.sh %b",SYMLINK+="iscsi/%c"
------------------------------------------------
vi /etc/udev/scripts/iscsidev.sh
------------------------------------------------
#!/bin/bash
BUS=${1}
HOST=${BUS%%:*}
[ -e /sys/class/iscsi_host ] || exit 1
file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"
target_name=$(cat ${file})
if [ -z "${target_name}" ] ; then
exit 1
fi
echo "${target_name##*:}"
-----------------------------------------------
chmod +x /etc/udev/scripts/iscsidev.sh
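Scan the iSCSI server from the client (note that the iscsi service must be started twice here, once before discovery and once after):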
service iscsi start
iscsiadm -m discovery -t sendtargets -p 192.168.1.33
service iscsi start
On the client, use fdisk -l to check that the shared disk has been attached:
[root@topplus opt]# fdisk -l
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 2857 22948821 7 HPFS/NTFS
/dev/sda2 2858 17139 114720165 f W95 Ext'd (LBA)
/dev/sda3 17140 30115 104229720 83 Linux
/dev/sda4 30116 30401 2297295 82 Linux swap / Solaris
/dev/sda5 2858 17139 114720133+ 7 HPFS/NTFS
Disk /dev/sdb: 4294 MB, 4294967296 bytes
133 heads, 62 sectors/track, 1017 cylinders
Units = cylinders of 8246 * 512 = 4221952 bytes
Disk /dev/sdb doesn't contain a valid partition table
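5. Create the raw devices:
The shared disk is now attached. Partition the shared disk /dev/sdb. (Mind the partition layout: create an extended partition so the layout can be extended later. Once the software is installed, the OCR and voting disk partitions can no longer be changed, only backed up and restored. The ASM disks can still be changed as long as they hold no data files.)
Each instance keeps OCR and voting disk information locally, but the running cluster reads and writes these two disks constantly; if they fail, the cluster fails.
/dev/sdb1 1 48 197873 83 Linux -- OCR, stores the cluster configuration, 200 MB
/dev/sdb2 49 96 197904 83 Linux -- voting disk, 200 MB, resolves split-brain between two nodes (the two nodes race for the voting disk's vote, and the winner evicts the other node). With three nodes, the three nodes vote: the side holding 2 votes is the healthy one, and the node with 1 vote is evicted from the cluster. The evicted node is locked out of the shared disk and reboots itself.
/dev/sdb3 97 500 1665692 83 Linux -- ASM disk, stores data files, 2 GB
/dev/sdb4 501 904 1665692 83 Linux -- ASM disk, stores data files, 2 GB
The partitioning can be done from any node, but partprobe must be run on both the server and the client so that the shared disk's partitions take effect.
Configure the partitions as raw devices (do this on every node):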
vi /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdb2", RUN+="/bin/raw /dev/raw/raw2 %N"
ACTION=="add", KERNEL=="sdb3", RUN+="/bin/raw /dev/raw/raw3 %N"
ACTION=="add", KERNEL=="sdb4", RUN+="/bin/raw /dev/raw/raw4 %N"
KERNEL=="raw[1]", MODE="0660", GROUP="oinstall", OWNER="root"
KERNEL=="raw[2]", MODE="0660", GROUP="oinstall", OWNER="oracle"
KERNEL=="raw[3-4]", MODE="0660", GROUP="oinstall", OWNER="oracle"
Before running the commands below, first run install.sh to create the oracle user.
# start_udev
# raw -qa
# ll /dev/raw/*
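The four binding rules in 60-raw.rules differ only in the partition and raw device number, so they can be generated rather than typed. A sketch, assuming partitions sdb1..sdbN map to raw1..rawN in order as in the rules above; gen_raw_rules is a hypothetical helper that only prints text and does not touch /etc/udev.

```shell
#!/bin/sh
# Sketch: print one raw-binding udev rule per partition of a disk.
# $1 = disk name without /dev/ (for example sdb), $2 = number of partitions.
gen_raw_rules() {
    disk=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # %%N in the printf format emits the literal %N that udev expands.
        printf 'ACTION=="add", KERNEL=="%s%s", RUN+="/bin/raw /dev/raw/raw%s %%N"\n' "$disk" "$i" "$i"
        i=$((i + 1))
    done
}
```

Running `gen_raw_rules sdb 4` prints the four ACTION lines; the ownership lines still have to be added by hand because they depend on which device holds the OCR.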
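III. Network parameter configuration
Network configuration notes:
Use static IP configuration: static
A gateway must be specified, and the priv and public gateways must be set identically on both machines.
Host network settings, /etc/hosts on every RAC node: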
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.10 stu10
192.168.1.12 stu12
# Public Virtual IP (VIP)
192.168.1.11 stu10-vip
192.168.1.13 stu12-vip
# Private Interconnect - (eth1)
10.0.0.10 stu10-priv
10.0.0.12 stu12-priv
Configure hangcheck-timer:
It monitors whether the Linux kernel has hung.
vi /etc/modprobe.conf
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Load the hangcheck-timer module automatically at boot:
vi /etc/rc.local
modprobe hangcheck-timer
Check that the hangcheck-timer module is loaded:
lsmod | grep hangcheck_timer
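The two module parameters together decide how long a node may hang before it is reset: the reset happens once the hang exceeds hangcheck_tick + hangcheck_margin seconds. A minimal sketch that reads the values from an options line and prints that threshold; hangcheck_threshold is an illustrative helper name.

```shell
#!/bin/sh
# Sketch: compute hangcheck_tick + hangcheck_margin from an options line.
hangcheck_threshold() {
    line=$1
    tick=$(printf '%s\n' "$line" | sed -n 's/.*hangcheck_tick=\([0-9][0-9]*\).*/\1/p')
    margin=$(printf '%s\n' "$line" | sed -n 's/.*hangcheck_margin=\([0-9][0-9]*\).*/\1/p')
    # Fail if either parameter is absent from the line.
    [ -n "$tick" ] && [ -n "$margin" ] || return 1
    echo $((tick + margin))
}

hangcheck_threshold "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180"   # prints 210
```

With the values used here the node is reset after a hang longer than 210 seconds.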
Configure SSH user equivalence:
node1:192.168.1.10
su - oracle
ssh-keygen -t rsa
ssh-keygen -t dsa
cd .ssh
cat *.pub > authorized_keys
Merge the authorized_keys files:
scp authorized_keys oracle@192.168.1.12:/home/oracle/.ssh/keys_dbs
node2:192.168.1.12
su - oracle
ssh-keygen -t rsa
ssh-keygen -t dsa
cd .ssh
cat *.pub > authorized_keys
cat keys_dbs >> authorized_keys
scp authorized_keys oracle@192.168.1.10:/home/oracle/.ssh/
Verify equivalence on node1 (192.168.1.10): ssh to the node itself as well as to the remote node, otherwise the keys do not take effect. Repeat until no yes/no prompt appears.
ssh 192.168.1.10
ssh stu10
ssh 10.0.0.10
ssh stu10-priv
ssh 192.168.1.12
ssh stu12
ssh 10.0.0.12
ssh stu12-priv
Verify equivalence on node2 (192.168.1.12):
ssh 192.168.1.12
ssh stu12
ssh 10.0.0.12
ssh stu12-priv
ssh 192.168.1.10
ssh stu10
ssh 10.0.0.10
ssh stu10-priv
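Once the host keys have been accepted by hand, the same eight targets can be rechecked in a loop. This is a sketch under the assumption that each node has exactly one public and one private address named as above; equivalence_targets and check_equivalence are hypothetical helpers. BatchMode=yes makes ssh fail instead of prompting, so a target that still asks for a password or a host-key confirmation shows up as FAILED.

```shell
#!/bin/sh
# Sketch: list and test the SSH targets for a set of nodes.
# Each argument is a name:public_ip:private_ip triple.
equivalence_targets() {
    for triple in "$@"; do
        name=${triple%%:*}
        rest=${triple#*:}
        pub=${rest%%:*}
        priv=${rest#*:}
        printf '%s\n' "$pub" "$name" "$priv" "$name-priv"
    done
}

# Run a harmless command on every target without allowing any prompt.
check_equivalence() {
    for target in $(equivalence_targets "$@"); do
        if ssh -o BatchMode=yes "$target" date >/dev/null 2>&1; then
            echo "ok      $target"
        else
            echo "FAILED  $target"
        fi
    done
}
```

As the oracle user on each node, this could be invoked as `check_equivalence stu10:192.168.1.10:10.0.0.10 stu12:192.168.1.12:10.0.0.12`.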
Node time synchronization (if node1 time > node2 time)
On node2: vi /etc/ntp.conf
server 127.127.1.0
fudge 127.127.1.0 stratum 11
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
On node1: vi /etc/ntp.conf
server 192.168.1.33 prefer
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
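# Note: the address is 127.127.1.0, not 127.0.0.1. Also, 192.168.1.33 is node 1's IP address!
Then run the following command on both nodes to start the NTP service:
/etc/init.d/ntpd start
Use ntpdate -u 192.168.1.33 to synchronize with the time server.
Scheduled synchronization:
crontab -e
*/5 * * * * /usr/sbin/ntpdate -u 192.168.1.33
This synchronizes every five minutes.
Restart the crond service:
service crond restart
Alternatively, once root SSH equivalence is configured between the two nodes: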
date -s $(ssh IP date "+%T")
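IV. Installing the Clusterware software
Use CVU to verify that the cluster is ready for installation (the check must pass on both nodes):
su - oracle
/mnt/clusterware/cluvfy/runcluvfy.sh stage -pre crsinst -n stu10,stu12 -verbose
Install Clusterware (if the clocks cannot be synchronized, run the installer on the node whose clock is behind):
Run the installer on one node only!
su - oracle
cd /mnt/clusterware/
./runInstaller
Note: change the installation path from db_1 to crs_1.
When the dialog asking you to run root.sh appears, do not run root.sh yet!
The root.sh script (stored under $CRS_HOME) is what starts the cluster resources.
First modify the vipca and srvctl scripts:
su - oracle
vi $CRS_HOME/bin/vipca
Find the following (around line 121):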
#Remove this workaround when the bug 3937317 is fixed
arch=`uname -m`
if [ "$arch" = "i686" -o "$arch" = "ia64" ]
then
LD_ASSUME_KERNEL=2.4.19
export LD_ASSUME_KERNEL
fi
#End workaround
Add a new line after the fi:
unset LD_ASSUME_KERNEL
------------------------------------------------------------
vi $CRS_HOME/bin/srvctl
Find the following (around line 167):
LD_ASSUME_KERNEL=2.4.19
export LD_ASSUME_KERNEL
Likewise, add a new line after it:
unset LD_ASSUME_KERNEL
------------------------------------------------------------
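For srvctl, where the new line goes directly after the export, the edit can also be scripted. The following is only a sketch, assuming the export line appears in the file exactly as quoted above; add_unset_after_export is a hypothetical helper that prints the modified script to standard output and leaves the original untouched. It is not a substitute for the vipca edit, where the unset belongs after the closing fi.

```shell
#!/bin/sh
# Sketch: print a script with "unset LD_ASSUME_KERNEL" added after each
# "export LD_ASSUME_KERNEL" line, unless the unset is already there.
add_unset_after_export() {
    awk '
        # Previous line was the export and this line is not already the unset.
        prev_export && $0 !~ /^[ \t]*unset LD_ASSUME_KERNEL[ \t]*$/ { print "unset LD_ASSUME_KERNEL" }
        { print; prev_export = ($0 ~ /^[ \t]*export LD_ASSUME_KERNEL[ \t]*$/) }
        # Handle an export on the very last line of the file.
        END { if (prev_export) print "unset LD_ASSUME_KERNEL" }
    ' "$1"
}
```

One cautious way to use it is to write the output to a temporary file, compare it with the original using diff, and only then copy it over $CRS_HOME/bin/srvctl. Running it on an already patched file changes nothing.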
After both scripts have been modified, run root.sh.
If root.sh reports the following error (usually the first node does not report it, but the second node does):
Error 0(Native: listNetInterfaces:[3])
[Error 0(Native: listNetInterfaces:[3])]
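Fix (use your own network adapter names and IP addresses; do not copy these blindly):
# ./oifcfg iflist
# ./oifcfg setif -global eth0/192.168.1.0:public
# ./oifcfg setif -global eth0:2/10.0.0.0:cluster_interconnect
# ./oifcfg getif
# unset LANG
# ./vipca (this must be rerun whenever the VIP is in a 192.x or 172.x range)
===> enter the VIP alias and the virtual IP is filled in automatically ===> finish
# ./crs_stat -t (if all resources are ONLINE, the setup is fine)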
----------------------------------------------------------------error
“PRKN-1008: Unable to load the shared library srvmhas10 during root.sh run on first node: ”,“PRKH-1010 : Unable to communicate with CRS services.
[PRKH-1000 : Unable to load the SRVM HAS shared library
[PRKN-1008 : Unable to load the shared library "srvmhas10" or a dependent library, from LD_LIBRARY_PATH="******"
[java.lang.UnsatisfiedLinkError: /****/crs/lib32/libsrvmhas10.so: libclntsh.so.10.1: cannot open shared object file: No such file or directory”
Cause: the development packages are missing.
----error
PRKC-1073 : Failed to transfer directory ...\installCopyFile.lst
Cause: SSH equivalence is not configured.
----error
“The given interface(s), "eth0" is not public.Public interfaces should be used to configure virtual IPs.”
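CVU raises this error when the public IP uses a non-routable address (192.168.x.x); the fix is to run VIPCA manually.
V. Installing the database software
Install Oracle:
Installing from one node is enough; choose to install the software only and do not create a database.
Listener configuration: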
node stu10 192.168.1.10
listener.ora
-----------------------------------------------------------------
LISTENER_stu10 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521)(IP = FIRST))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.10)(PORT = 1521)(IP = FIRST))
)
)
SID_LIST_LISTENER_stu10 =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /oracle/app/product/10.2.0/db_1)
(PROGRAM = extproc)
)
)
-----------------------------------------------------------------
tnsnames.ora
-----------------------------------------------------------------
RACDB1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =stu10-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racdb)
(INSTANCE_NAME = racdb1)
)
)
RACDB =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
)
(CONNECT_DATA =
(SERVICE_NAME = racdb)
)
)
RACDB_PRECONNECT =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racdb_preconnect)
(FAILOVER_MODE =
(BACKUP = racdb)
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)
)
LISTENERS_RACDB =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
RACDB2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racdb)
(INSTANCE_NAME = racdb2)
)
)
------------------------------------------------------------
node stu12 192.168.1.12
listener.ora
------------------------------------------------------------
SID_LIST_LISTENER_stu12 =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /oracle/app/product/10.2.0/db_1)
(PROGRAM = extproc)
)
)
LISTENER_stu12 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521)(IP = FIRST))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.12)(PORT = 1521)(IP = FIRST))
)
)
-------------------------------------------------------------
tnsnames.ora
-------------------------------------------------------------
RACDB1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racdb)
(INSTANCE_NAME = racdb1)
)
)
RACDB =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
(LOAD_BALANCE = yes) -- client-side load-balance setting: it is connection-based and picks an address at random, so the distribution may not be perfectly even
)
(CONNECT_DATA =
(SERVICE_NAME = racdb)
)
)
RACDB_PRECONNECT =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racdb_preconnect)
(FAILOVER_MODE = -- failover configuration
(BACKUP = racdb)
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)
)
LISTENERS_RACDB =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu10-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
)-- server-side load-balance setting: it is load-based and requires the remote_listener initialization parameter
RACDB2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = stu12-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = racdb)
(INSTANCE_NAME = racdb2)
)
)
---------------------------------------------------
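Create the database with dbca:
Choose the Clusterware (cluster database) mode
==> Select both nodes
==> Enter the database name racdb
==> For the storage option, choose ASM
==> Choose where to store the parameter file; the ASM disks have not been created yet at this point, so it cannot be placed on an ASM device
==> Select the ASM disk group:
Group name: dga
Redundancy: External -- ASM does not mirror the files; a single copy is stored.
Normal -- ASM mirrors the files; two copies are stored.
High -- ASM mirrors the files; three copies are stored.
Check all the raw devices that are needed, then confirm.
At this point Oracle formats the disk group and writes the ASM metadata to the header of each raw device.
==> Choose the flash recovery area
==> Add services
==> Choose the connection policy: pre-connect
==> Adjust the memory parameters
==> Create the database; also check the option to generate the database creation scripts, as a fallback in case dbca fails
From the ITPUB blog, link: http://blog.itpub.net/24756465/viewspace-717894/. If you repost, please credit the source; otherwise legal action will be pursued.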
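The redundancy choice in the disk group step determines how much of the raw capacity is usable, since each file is stored one, two, or three times. A rough sketch of that arithmetic, ignoring ASM metadata and any space kept free for rebalancing; asm_usable_mb is an illustrative helper and the mode names are simply the three options listed above.

```shell
#!/bin/sh
# Sketch: approximate usable space of an ASM disk group.
# $1 = redundancy (external|normal|high), $2 = total raw size in MB.
asm_usable_mb() {
    mode=$1
    raw_mb=$2
    case $mode in
        external) copies=1 ;;
        normal)   copies=2 ;;
        high)     copies=3 ;;
        *) echo "unknown redundancy: $mode" >&2; return 1 ;;
    esac
    # Integer division: every file occupies "copies" times its size.
    echo $((raw_mb / copies))
}
```

For two disks of about 1600 MB each, `asm_usable_mb external 3200` gives 3200 while `asm_usable_mb normal 3200` gives 1600.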