1. Environment planning

| Node | IP |
|---|---|
| db1 | 192.168.88.5 |
| db2 | 192.168.88.6 |
| vip | 192.168.88.7 |
| shared directory | /oracle |
2. Install packages (all nodes)
cat rhel.repo
[rhel]
name=rhel
baseurl=file:///mnt
enabled=1
gpgcheck=0
yum -y install pacemaker pacemaker-cli pacemaker-cluster-libs pacemaker-libs corosync pcs fence-agents-all ruby corosynclib libqb resource-agents drbd-pacemaker gfs2-utils
rpm -ivh pcs-0.9.169-3.el7.centos.x86_64.rpm --nodeps --force
3. Configure /etc/hosts (all nodes)
cat /etc/hosts
192.168.88.5 db1
192.168.88.6 db2
192.168.88.7 vip
4. Configure SSH mutual trust
[root@db1 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:uHkBKtuR1hCGyq3h2xlLeBBm7BFrx1XyYzZMm/UYvBU root@db1
The key's randomart image is:
+---[RSA 2048]----+
|....oo.o.o E. |
| *+...= +.+. |
|=++o. .X .o. |
|o=.. =ooo. |
|. * = o S |
| + O . o . |
| * = o . |
| . + . |
| |
+----[SHA256]-----+
[root@db1 ~]# ssh-copy-id -i .ssh/id_rsa.pub db2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_rsa.pub"
The authenticity of host 'db2 (192.168.80.239)' can't be established.
ECDSA key fingerprint is SHA256:B+nXQP1MTbO6is9399Gsgd0Zq6O7cJoAoGzwaEV4IRY.
ECDSA key fingerprint is MD5:9a:76:97:61:f4:b2:3e:5c:ce:a7:20:dc:51:d1:5a:6d.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@db2's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'db2'"
and check to make sure that only the key(s) you wanted were added.
[root@db2 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Pnn2HYhAjAWtZsnDleqrJ5UzeRvTvUO+EfGnfTux3JE root@db2
The key's randomart image is:
+---[RSA 2048]----+
| .o.. |
| ++ |
| o.=o . |
| X. o |
| + +S. .. . o|
| B.+o..oo E |
| . =+++oo.+ B|
| . o .+ .+o =+|
| .+ oo...|
+----[SHA256]-----+
[root@db2 ~]# ssh-copy-id -i .ssh/id_rsa.pub db1
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@db1's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'db1'"
and check to make sure that only the key(s) you wanted were added.
--test passwordless login
ssh db2 date
5. Stop conflicting services (all nodes)
--stop the firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
--disable SELinux
[root@db1 ~]# cat /etc/selinux/config
SELINUX=disabled
6. Start pcsd (all nodes)
systemctl enable pcsd
systemctl start pcsd
systemctl status pcsd
7. Authenticate the cluster nodes
[root@db1 ~]# pcs cluster auth db1 db2
Username: hacluster
Password:
db1: Authorized
db2: Authorized
The authorization tokens are stored in ~/.pcs/tokens or /var/lib/pcsd/tokens.
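Note that `pcs cluster auth` logs in as the `hacluster` user, which has no password by default. A minimal sketch of the step that must precede the auth command on every node (the password below is a placeholder, not from the original setup):

```shell
# Run on ALL nodes before "pcs cluster auth".
# "ClusterPass123" is a placeholder -- substitute your own password.
echo 'ClusterPass123' | passwd --stdin hacluster
```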
8. Set up the cluster
[root@db1 ~]# pcs cluster setup --start --name oracle db1 db2
Destroying cluster on nodes: db1, db2...
db1: Stopping Cluster (pacemaker)...
db2: Stopping Cluster (pacemaker)...
db2: Successfully destroyed cluster
db1: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'db1', 'db2'
db1: successful distribution of the file 'pacemaker_remote authkey'
db2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
db1: Succeeded
db2: Succeeded
Starting cluster on nodes: db1, db2...
db1: Starting Cluster (corosync)...
db2: Starting Cluster (corosync)...
db1: Starting Cluster (pacemaker)...
db2: Starting Cluster (pacemaker)...
Synchronizing pcsd certificates on nodes db1, db2...
db1: Success
db2: Success
Restarting pcsd on the nodes in order to reload the certificates...
db1: Success
db2: Success
--enable the services at boot
[root@db1 soft]# systemctl enable corosync
Created symlink from /etc/systemd/system/multi-user.target.wants/corosync.service to /usr/lib/systemd/system/corosync.service.
[root@db1 soft]# systemctl enable pacemaker
Created symlink from /etc/systemd/system/multi-user.target.wants/pacemaker.service to /usr/lib/systemd/system/pacemaker.service.
The two configuration files corosync.conf and cib.xml do not exist by default.
Purpose:
corosync.conf provides the parameters used by corosync.
cib.xml is an XML file that stores the cluster configuration and the state of all resources; the pcsd daemon keeps the CIB content synchronized across the nodes.
Although both files can be created and edited by hand, managing them through the pcs tool is recommended.
After the setup step, the configuration file has been generated automatically:
[root@db1 ~]# ll /etc/corosync/corosync.conf
-rw-r--r-- 1 root root 377 Feb 6 13:59 /etc/corosync/corosync.conf
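Both files described above can be inspected through pcs itself; a small sketch of read-only commands, safe to run on any node:

```shell
# Dump the raw CIB XML that pcsd keeps in sync across the nodes.
pcs cluster cib > /tmp/cib-snapshot.xml

# Display the corosync configuration as pcs sees it.
pcs cluster corosync
```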
--check the status
[root@db1 soft]# pcs status
Cluster name: oracle
WARNINGS:
No stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: db1 (version 1.1.23-1.el7-9acf116022) - partition with quorum
Last updated: Tue Feb 7 12:10:49 2023
Last change: Tue Feb 7 12:00:03 2023 by hacluster via crmd on db1
2 nodes configured
0 resource instances configured
Online: [ db1 db2 ]
No resources
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@db1 soft]# systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2023-02-07 11:59:40 CST; 11min ago
Docs: man:pacemakerd
https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
Main PID: 14905 (pacemakerd)
CGroup: /system.slice/pacemaker.service
├─14905 /usr/sbin/pacemakerd -f
├─14907 /usr/libexec/pacemaker/cib
├─14908 /usr/libexec/pacemaker/stonithd
├─14909 /usr/libexec/pacemaker/lrmd
├─14910 /usr/libexec/pacemaker/attrd
├─14911 /usr/libexec/pacemaker/pengine
└─14912 /usr/libexec/pacemaker/crmd
Feb 07 12:02:51 db1 crmd[14912]: notice: State transition S_IDLE -> S_INTEGRATION
Feb 07 12:02:51 db1 attrd[14910]: notice: Recorded local node as attribute writer (was unset)
Feb 07 12:02:54 db1 pengine[14911]: error: Resource start-up disabled since no STONITH resources have been defined --this error appears because no STONITH resources are defined; for this setup STONITH will be disabled, so the message can be ignored
Feb 07 12:02:54 db1 pengine[14911]: error: Either configure some or disable STONITH with the stonith-enabled option
Feb 07 12:02:54 db1 pengine[14911]: error: NOTE: Clusters with shared data need STONITH to ensure data integrity
Feb 07 12:02:54 db1 pengine[14911]: notice: Delaying fencing operations until there are resources to manage
Feb 07 12:02:54 db1 pengine[14911]: notice: Calculated transition 1, saving inputs in /var/lib/pacemaker/pengine/pe-input-1.bz2
Feb 07 12:02:54 db1 pengine[14911]: notice: Configuration ERRORs found during PE processing. Please run "crm_verify -L" to identify issues.
Feb 07 12:02:54 db1 crmd[14912]: notice: Transition 1 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1.bz2): Complete
Feb 07 12:02:54 db1 crmd[14912]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
--disable STONITH
pcs property set stonith-enabled=false
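After changing the property, the configuration can be re-validated with the verifier that the pengine log itself suggests; a minimal sketch:

```shell
# Re-run the verifier from the log message; no output means no errors.
crm_verify -L -V

# Confirm the property took effect.
pcs property list | grep stonith-enabled
```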
[root@db1 soft]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2023-02-07 11:48:06 CST; 28min ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Main PID: 14001 (corosync)
CGroup: /system.slice/corosync.service
└─14001 /usr/sbin/corosync start
Feb 07 11:48:06 db1 corosync[14001]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Feb 07 11:48:06 db1 corosync[14001]: [QUORUM] Members[1]: 1
Feb 07 11:48:06 db1 corosync[14001]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 07 11:48:06 db1 corosync[14001]: [TOTEM ] A new membership (192.168.80.238:22) was formed. Members joined: 2
Feb 07 11:48:06 db1 corosync[14001]: [CPG ] downlist left_list: 0 received
Feb 07 11:48:06 db1 corosync[14001]: [CPG ] downlist left_list: 0 received
Feb 07 11:48:06 db1 corosync[14001]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Feb 07 11:48:06 db1 corosync[14001]: [QUORUM] This node is within the primary component and will provide service.
Feb 07 11:48:06 db1 corosync[14001]: [QUORUM] Members[2]: 1 2
Feb 07 11:48:06 db1 corosync[14001]: [MAIN ] Completed service synchronization, ready to provide service.
[root@db2 soft]# pcs cluster enable --all
db1: Cluster Enabled
db2: Cluster Enabled
9. Add the VIP resource
[root@db1 soft]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.88.7 cidr_netmask=24 nic=enp0s3 op monitor interval=10s
[root@db1 soft]# pcs status
Cluster name: oracle
Stack: corosync
Current DC: db1 (version 1.1.23-1.el7-9acf116022) - partition with quorum
Last updated: Tue Feb 7 15:50:45 2023
Last change: Tue Feb 7 15:50:42 2023 by root via cibadmin on db1
2 nodes configured
1 resource instance configured
Online: [ db1 db2 ]
Full list of resources:
VirtualIP (ocf::heartbeat:IPaddr2): Started db1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
--test failover
[root@db1 ~]# pcs resource move VirtualIP db2
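`pcs resource move` works by adding a location constraint, so the VIP stays pinned to db2 until that constraint is removed. A sketch of verifying the move and cleaning up afterwards (the interface name comes from the create command above):

```shell
# Confirm the VIP is now plumbed on db2's interface.
ssh db2 ip addr show enp0s3 | grep 192.168.88.7

# Remove the location constraint left behind by "pcs resource move".
pcs resource clear VirtualIP
```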
10. Create the shared volume group
--create the volume group and filesystem
[root@db1 ~]# pvcreate /dev/sdb
Physical volume "/dev/sdb" successfully created.
[root@db1 ~]# vgcreate vgoracle /dev/sdb
Volume group "vgoracle" successfully created
[root@db1 ~]# lvcreate -l 100%free -n oracle vgoracle
Logical volume "oracle" created.
[root@db1 ~]# mkfs -t xfs /dev/vgoracle/oracle
meta-data=/dev/vgoracle/oracle isize=512 agcount=4, agsize=1310464 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=5241856, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@db1 ~]# mkdir /oracle
[root@db1 ~]# mount /dev/vgoracle/oracle /oracle/
[root@db1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vgoracle-oracle 20G 33M 20G 1% /oracle
11. Enable HA-LVM
lvmconf --enable-halvm --services --startstopservices
vi /etc/lvm/lvm.conf
volume_list = [ "rhel" ]   ---list every VG that is NOT managed by the cluster
--back up the initramfs:
[root@db1 boot]# cp /boot/initramfs-3.10.0-957.el7.x86_64.img /tmp/soft/initramfs-3.10.0-957.el7.x86_64.img.bak
--regenerate the initramfs:
[root@db1 soft]# dracut -f -v
[root@db1 soft]# grub2-mkconfig -o /boot/grub2/grub.cfg
dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
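Whether the rebuilt initramfs actually carries the updated lvm.conf can be checked with dracut's lsinitrd tool before rebooting; a minimal sketch:

```shell
# Print the copy of lvm.conf embedded in the current initramfs
# and confirm it contains the volume_list setting.
lsinitrd -f etc/lvm/lvm.conf /boot/initramfs-$(uname -r).img | grep volume_list
```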
--reboot the host:
init 6
12. Add the LVM resource
[root@db1 ~]# pcs resource create lv_oracle LVM volgrpname=vgoracle exclusive=true
--failover test
[root@db1 ~]# pcs resource move lv_oracle db2
13. Add the Filesystem resource
[root@db1 dev]# pcs resource create data_oracle ocf:heartbeat:Filesystem device="/dev/vgoracle/oracle" directory="/oracle" fstype="xfs" op monitor interval=10s
--add the resources created above to the db_cluster group
pcs resource group add db_cluster VirtualIP lv_oracle data_oracle
14. Install the database
The database software must be installed in the shared directory.
15. Configure a static listener and tnsnames.ora
--static listener
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(GLOBAL_DBNAME = orcldg)
(ORACLE_HOME = /oracle/app/oracle/product/11.2.0/db_home)
(SID_NAME = orcldg)
)
)
LISTENER =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.88.7)(PORT = 1521))  # HOST is the VIP address
)
--configure tnsnames.ora
ORCL =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.88.7)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
)
)
16. Copy configuration files to the other node
scp -p /etc/oraInst.loc db2:/etc/
scp -p /etc/oratab db2:/etc/
scp -p /usr/local/bin/coraenv db2:/usr/local/bin/
scp -p /usr/local/bin/dbhome db2:/usr/local/bin/
scp -p /usr/local/bin/oraenv db2:/usr/local/bin/
17. Create the database resources
--listener
pcs resource create listener oralsnr sid="orcl" listener="listener" --group=db_cluster
--instance
pcs resource create orcl oracle sid="orcl" --group=db_cluster
18. Define resource constraints
pcs constraint colocation add lv_oracle with VirtualIP
pcs constraint colocation add data_oracle with lv_oracle
pcs constraint colocation add listener with data_oracle
pcs constraint colocation add orcl with listener
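Colocation alone keeps resources on the same node but does not guarantee start order (the filesystem must be mounted before the database starts). Here the group db_cluster already implies ordering by position; without the group, explicit order constraints would be a sketch like:

```shell
# Start order: VIP -> LVM -> filesystem -> listener -> instance.
pcs constraint order start VirtualIP then lv_oracle
pcs constraint order start lv_oracle then data_oracle
pcs constraint order start data_oracle then listener
pcs constraint order start listener then orcl
```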
19. Failover test
--before failover
[root@db1 ~]# pcs status
Cluster name: oracle
Stack: corosync
Current DC: db1 (version 1.1.23-1.el7-9acf116022) - partition with quorum
Last updated: Wed Feb 22 09:51:07 2023
Last change: Wed Feb 22 09:42:04 2023 via db1 on db2
2 nodes configured
5 resource instances configured
Online: [ db1 db2 ]
Full list of resources:
Resource Group: db_cluster
VirtualIP (ocf::heartbeat:IPaddr2): Started db1
lv_oracle (ocf::heartbeat:LVM): Started db1
data_oracle (ocf::heartbeat:Filesystem): Started db1
listener (ocf::heartbeat:oralsnr): Started db1
orcl (ocf::heartbeat:oracle): Started db1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
--after failover
[root@db2 ~]# pcs status
Cluster name: oracle
Stack: corosync
Current DC: db2 (version 1.1.23-1.el7-9acf116022) - partition with quorum
Last updated: Wed Feb 22 09:55:31 2023
Last change: Wed Feb 22 09:42:04 2023 via db1 on db2
2 nodes configured
5 resource instances configured
Online: [ db2 ]
OFFLINE: [ db1 ]
Full list of resources:
Resource Group: db_cluster
VirtualIP (ocf::heartbeat:IPaddr2): Started db2
lv_oracle (ocf::heartbeat:LVM): Started db2
data_oracle (ocf::heartbeat:Filesystem): Started db2
listener (ocf::heartbeat:oralsnr): Started db2
orcl (ocf::heartbeat:oracle): Started db2
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
20. Web management interface
https://IP:2224/login