Linux High-Availability (HA) Clusters: Resource Management with heartbeat and crm

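I. HA clusters: heartbeat resource management with crm

1. Cluster working models:

A/P: two nodes working in an active/passive model

N-M: N > M, N nodes running M services

N-N: N nodes running N services

A/A: dual-active (active/active) model

2. How resources fail over

rgmanager: failover domain, priority

pacemaker:

Resource stickiness: how strongly a resource prefers to stay on the node where it is currently running

Resource constraints (three types):

Location constraint: which node a resource prefers to run on

inf: positive infinity

n: a positive score, a preference for the node

-n: a negative score, a preference against the node

-inf: negative infinity

Colocation constraint: how strongly resources prefer to run on the same node

inf: the resources must run together

-inf: the resources must never run together

Order constraint: the order in which resources start and stop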
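
The score values listed above combine by simple rules: anything plus -inf is -inf, and otherwise anything plus inf is inf. The sketch below only illustrates that arithmetic; the function name `add_scores` is hypothetical and this is not how the cluster manager itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper: combine two constraint scores.
# Each score is an integer, or the literal string "inf" or "-inf".
add_scores() {
    a=$1
    b=$2
    # -inf wins over everything, including +inf
    if [ "$a" = "-inf" ] || [ "$b" = "-inf" ]; then
        echo "-inf"
    elif [ "$a" = "inf" ] || [ "$b" = "inf" ]; then
        echo "inf"
    else
        echo $(( $a + $b ))
    fi
}

add_scores 100 -50    # prints 50
add_scores inf -inf   # prints -inf
add_scores 200 inf    # prints inf
```

A node whose combined score ends up at -inf can never host the resource, which is why a single -inf constraint is enough to ban a node.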

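3. How do we make the three resources of a web service (VIP, httpd and filesystem) run on the same node?

1. Colocation constraints

2. A resource group

4. What happens to the resources running on a node once that node is no longer a cluster member

stopped: stop the resources

ignore: ignore the situation and keep running

freeze: keep the resources that are already running, but take on nothing new

suicide: kill the node

5. Whether a resource starts as soon as it has been configured

Decided by target-role.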

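6. RA (resource agent) types

heartbeat legacy

LSB

OCF

STONITH

7. Resource types

primitive (native): a primitive resource; it runs on only one node at a time

group: a resource group

clone: a cloned resource

Parameters: the total number of clones, and the maximum number of clones each node may run

Typical uses: STONITH, cluster filesystem

master/slave: a master/slave resource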

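8. Distributed lock:

/usr/lib64/heartbeat

haresources2cib.py

9. Graphical configuration

ha.cf

crm on

/usr/lib64/heartbeat/ha_propagate: copies the configuration files to the other nodes

10. Installing the GUI

heartbeat v2 uses crm as its cluster resource manager; this requires adding the following line to ha.cf:

crm on

crm listens on 5560/tcp through the integrated mgmtd

On the host where hb_gui will be started, set a password for the hacluster user, then launch hb_gui

with quorum: the partition holds a quorum of votes

without quorum: the partition does not hold a quorum of votes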
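
The "with quorum" and "without quorum" states can be illustrated with a plain majority rule: a partition has quorum when it holds more than half of all votes. This is a minimal sketch under that assumption; the function name `has_quorum` is hypothetical, and heartbeat's own quorum plugins may treat two-node clusters specially.

```shell
#!/bin/sh
# Hypothetical helper: simple-majority quorum check.
# $1 = votes held by this partition, $2 = total votes in the cluster.
has_quorum() {
    votes=$1
    total=$2
    # Quorum requires strictly more than half of all votes.
    if [ $(( votes * 2 )) -gt "$total" ]; then
        echo "with quorum"
    else
        echo "without quorum"
    fi
}

has_quorum 2 3   # prints "with quorum"
has_quorum 1 2   # prints "without quorum"
```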

11. Defining a highly available web service

VIP

httpd

from

to: the resource the constraint is based on

web service

VIP

httpd

NFS

Note: haresources is incompatible with crm and is not read by crm

II. Configuration

1. ha.cf

[root@snn heartbeat]# vim /etc/ha.d/ha.cf

mcast eth0 225.0.100.19 694 1 0

crm on

[root@snn heartbeat]# /usr/lib64/heartbeat/ha_propagate

Propagating HA configuration files to node datanode4.abc.com.

ha.cf      100%   10KB  10.4KB/s   00:00

authkeys    100%  694     0.7KB/s   00:00

Setting HA startup configuration on node datanode4.abc.com.
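
For orientation, here is a sketch of what a small ha.cf for this two-node setup might contain. Only the `mcast` and `crm on` lines come from the steps above; the `logfacility`, `keepalive` and `deadtime` values are assumed for illustration, and the file is written to a scratch location rather than /etc/ha.d/ha.cf.

```shell
#!/bin/sh
# Write an illustrative ha.cf to a scratch file (not /etc/ha.d/ha.cf).
# Only the mcast and crm lines are taken from the article; the rest are assumed values.
conf=$(mktemp)
cat > "$conf" <<'EOF'
logfacility local0
keepalive 2
deadtime 30
mcast eth0 225.0.100.19 694 1 0
node snn.abc.com
node datanode4.abc.com
crm on
EOF

# Show the directives that set the multicast link and switch heartbeat to crm mode.
grep -E '^(crm|mcast) ' "$conf"
```

The node names must match what `uname -n` prints on each host, which is why the fully qualified names from the log output are used here.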

2. Note: haresources is incompatible with crm and is not read by crm, so move it out of the way on both nodes

[root@snn heartbeat]# mv /etc/ha.d/haresources /root

The mv below is run on host datanode4

[root@datanode4 ha.d]# mv haresources /root/

[root@snn heartbeat]# service heartbeat start

logd is already running

Starting High-Availability services:

Done.

[root@snn heartbeat]# ssh datanode4 'service heartbeat start'

logd is already running

Starting High-Availability services:

Done.

3. Check the log

[root@snn heartbeat]# tail -f /var/log/messages

Jun 19 16:00:29 snn crmd: [2223]: notice: populate_cib_nodes: Node: datanode4.abc.com (uuid: 0862d824-047e-4826-9e26-21a7603f53c8)

Jun 19 16:00:30 snn crmd: [2223]: notice: populate_cib_nodes: Node: snn.abc.com (uuid: 6009ca6a-56eb-4d35-872e-3b8dc0fc9851)

Jun 19 16:00:30 snn crmd: [2223]: info: do_ha_control: Connected to Heartbeat

Jun 19 16:00:30 snn crmd: [2223]: info: do_ccm_control: CCM connection established... waiting for first callback

Jun 19 16:00:30 snn crmd: [2223]: info: do_started: Delaying start, CCM (0000000000100000) not connected

Jun 19 16:00:30 snn crmd: [2223]: info: crmd_init: Starting crmd's mainloop

Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client snn.abc.com/crmd now has status [online]

Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client snn.abc.com/crmd now has status [online]

Jun 19 16:00:30 snn crmd: [2223]: notice: crmd_client_status_callback: Status update: Client datanode4.abc.com/crmd now has status [online]

Jun 19 16:00:30 snn cib: [2219]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm

Jun 19 16:00:30 snn cib: [2219]: info: mem_handle_event: instance=5, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4

Jun 19 16:00:30 snn cib: [2219]: info: cib_ccm_msg_callback: PEER: datanode4.abc.com

Jun 19 16:00:30 snn cib: [2219]: info: cib_ccm_msg_callback: PEER: snn.abc.com

Jun 19 16:00:31 snn crmd: [2223]: info: do_started: Delaying start, CCM (0000000000100000) not connected

Jun 19 16:00:31 snn crmd: [2223]: info: mem_handle_event: Got an event OC_EV_MS_NEW_MEMBERSHIP from ccm

Jun 19 16:00:31 snn crmd: [2223]: info: mem_handle_event: instance=5, nodes=2, new=2, lost=0, n_idx=0, new_idx=0, old_idx=4

Jun 19 16:00:31 snn crmd: [2223]: info: crmd_ccm_msg_callback: Quorum (re)attained after event=NEW MEMBERSHIP (id=5)

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: NEW MEMBERSHIP: trans=5, nodes=2, new=2, lost=0 n_idx=0, new_idx=0, old_idx=4

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011CURRENT: datanode4.abc.com [nodeid=0, born=3]

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011CURRENT: snn.abc.com [nodeid=1, born=5]

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011NEW:     datanode4.abc.com [nodeid=0, born=3]

Jun 19 16:00:31 snn crmd: [2223]: info: ccm_event_detail: #011NEW:     snn.abc.com [nodeid=1, born=5]

Jun 19 16:00:31 snn crmd: [2223]: info: do_started: The local CRM is operational

Jun 19 16:00:31 snn crmd: [2223]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_CCM_CALLBACK origin=do_started ]

4. Check the cluster status

// To print the status once and exit, use crm_mon --one-shot

[root@snn heartbeat]# crm_mon

Refresh in 6s...

============

Last updated: Fri Jun 19 16:11:34 2015

Current DC: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851)

2 Nodes configured.

0 Resources configured.

============

Node: datanode4.abc.com (0862d824-047e-4826-9e26-21a7603f53c8): online

Node: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851): online
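
When scripting around this output, it can help to count how many nodes are reported online. The sketch below assumes the line format shown above; the function name `count_online` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count nodes reported as online in crm_mon-style output on stdin.
count_online() {
    grep -c '^Node: .*: online$'
}

# Sample input copied from the crm_mon output above.
count_online <<'EOF'
2 Nodes configured.
Node: datanode4.abc.com (0862d824-047e-4826-9e26-21a7603f53c8): online
Node: snn.abc.com (6009ca6a-56eb-4d35-872e-3b8dc0fc9851): online
EOF
```

On a live cluster the same function could be fed with `crm_mon --one-shot | count_online`.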

4. The crm command-line tool

[root@snn heartbeat]# crm_sh

/usr/sbin/crm_sh:31: DeprecationWarning: The popen2 module is deprecated.  Use the subprocess module.

from popen2 import Popen3

crm # help

Usage: crm (nodes|config|resources)

crm # nodes

crm nodes # help

Usage: nodes (status|list)

crm nodes # list

crm nodes #

5. Installing heartbeat automatically creates a user named hacluster, but without a password; one has to be set

[root@snn heartbeat]# cat /etc/passwd |grep hacluster

hacluster:x:498:498:heartbeat user:/var/lib/heartbeat/cores/hacluster:/sbin/nologin

[root@snn heartbeat]# passwd hacluster

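Changing password for user hacluster.

New password:

BAD PASSWORD: it is WAY too short

BAD PASSWORD: is too simple

Retype new password:

passwd: all authentication tokens updated successfully.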

6. Run hb_gui directly

[root@snn ~]# hb_gui

Traceback (most recent call last):

  File "/usr/bin/hb_gui", line 41, in <module>

    import gtk, gtk.glade, gobject

  File "/usr/lib64/python2.6/site-packages/gtk-2.0/gtk/__init__.py", line 64, in <module>

    _init()

  File "/usr/lib64/python2.6/site-packages/gtk-2.0/gtk/__init__.py", line 52, in _init

    _gtk.init_check()

RuntimeError: could not open display

The command fails with the error shown above

c381be89e2c343ed4eaf90685393b52f.png

Download and install Xmanager on the client machine so that an X display is available

Then run the command again

59fa94892693e2c35bb9b5bcfc0dfc3f.png
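
hb_gui is a GTK program, so it needs an X display; "could not open display" usually means the DISPLAY variable is unset or points at a server that cannot be reached. A small pre-flight check, with the hypothetical function name `check_display`, might look like this:

```shell
#!/bin/sh
# Hypothetical pre-flight check before launching a GTK program such as hb_gui.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set"
        return 1
    fi
    echo "DISPLAY=$DISPLAY"
}

check_display || echo "start an X server on the client (for example Xmanager) or log in with ssh -X"
```

The check only confirms that DISPLAY is set; it does not verify that the X server behind it accepts connections.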

III. Defining resources in hb_gui

1. Define the name of the primitive resource

0e22e0f375b32e33ea956f2df4f65f16.png

91a963b1e8744109098a603dc354a20f.png

21ae47e603bb90e164b41cbd742f438f.png

7f40646fc6a6246f0385d83571e91b93.png

2. Continue by defining another primitive resource

20912b6d89e8807c8b39e387ff298888.png

aca1765da65b156dd429a411047f5586.png

42a554749eca226fc8a66e52ff6be68b.png

70ce1c26a221f255234b7eb9c32dc521.png

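3. Make the two resources run on the same node. There are two ways: (1) define a colocation constraint, (2) define a resource group

(1) Define a colocation constraint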

1f12159833af2c8d92e842dbf3fbdc10.png

092c76556f7117d7f5b39f100354b858.png

53443f463ff03f314bce86924e99b043.png

27ef805d1b0231ea3c4b6f75330047da.png

4. Put the snn node into standby

705799ce652f62fb517314d20021884c.png

9c45d869fd93b2d729b70a9983b5ddf4.png

IV. Defining the resources as a group

web server:

vip: 192.168.1.8

httpd

nfs: 192.168.1.4:/web/htdocs mounted on /var/www/html

1. Delete the original primitive resources

99833ef4d7a86f98beca4e8afd68c48d.png

2. Define the group resource

8cf8193c4c5517d95d358f97c29b4ef5.png

3fdfe4c6c58a1b102b95f64557485a86.png

e8641299c8595ad67aab2501f56e2b8c.png

cb9f76dd609e7e121d9eec3ef97f9146.png

548de78f277f5edebc027947c939dfd8.png

e263af7e34facf20680b1579cf475c31.png

6d2568ffeeff575099e1f907f16f278b.png

3. httpd fails to start; the log shows the following

a19ec05123607bc525e51d69cc217331.png

According to the log, NFS is mounted correctly on datanode4, but httpd starts and is then stopped again, which is odd

4. On datanode4, start httpd on its own; it fails

[root@datanode4 ~]# /etc/init.d/httpd restart

Stopping httpd:             [FAILED]

Starting httpd: Syntax error on line 292 of /etc/httpd/conf/httpd.conf:

5. Checking the SELinux status reveals where the problem lies

[root@datanode4 conf]# getenforce

Enforcing

[root@datanode4 conf]# setenforce 0

[root@datanode4 conf]# getenforce

Permissive

setenforce only changes the running system, so also set the mode to disabled in the configuration file

[root@datanode4 conf]# vim /etc/selinux/config

SELINUX=disabled
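
`getenforce` reports the mode of the running system, while /etc/selinux/config decides the mode after the next boot, so it is worth confirming that the file really contains the intended value. A minimal sketch, with the hypothetical function name `selinux_config_mode` and a scratch file standing in for the real config:

```shell
#!/bin/sh
# Hypothetical helper: print the SELINUX= value from a config file.
# Defaults to /etc/selinux/config when no path is given.
selinux_config_mode() {
    file=${1:-/etc/selinux/config}
    # Comment lines and SELINUXTYPE= do not match this pattern.
    sed -n 's/^SELINUX=\(.*\)$/\1/p' "$file"
}

# Demonstrate on a scratch copy so the real config is not touched.
sample=$(mktemp)
printf '# comment\nSELINUX=disabled\nSELINUXTYPE=targeted\n' > "$sample"
selinux_config_mode "$sample"   # prints disabled
rm -f "$sample"
```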

6. Start httpd on its own again

[root@datanode4 conf]# /etc/init.d/httpd start

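Starting httpd:           [  OK  ]

[root@datanode4 conf]# /etc/init.d/httpd stop

Stopping httpd:

7. Back on snn, run hb_gui again. The group used to be called webservice; the name is arbitrary and has no effect. Because I had deleted it, I created a new resource and named it httpd so that it is easy to recognize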

41176668256aab94a79a7ba19923af67.png

V. Verification

1. Contents of index.html on the NFS server

[root@datanode ~]# cat /web/htdocs/index.html

datanode.abc.com

[root@datanode ~]# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:0C:29:50:AC:6E

inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fe50:ac6e/64 Scope:Link

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

RX packets:83505 errors:0 dropped:0 overruns:0 frame:0

TX packets:2037 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:7403212 (7.0 MiB)  TX bytes:228350 (222.9 KiB)

2. The VIP on host datanode4. Plain ifconfig does not show it, because the VIP is not configured as an interface alias, so use ip addr show instead

[root@datanode4 html]# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:0C:29:E1:2F:66

inet addr:192.168.1.6  Bcast:192.168.1.255  Mask:255.255.255.0

inet6 addr: fe80::20c:29ff:fee1:2f66/64 Scope:Link

UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

RX packets:147365 errors:0 dropped:0 overruns:0 frame:0

TX packets:66651 errors:0 dropped:0 overruns:0 carrier:0

collisions:0 txqueuelen:1000

RX bytes:20284443 (19.3 MiB)  TX bytes:14571080 (13.8 MiB)

[root@datanode4 html]# ip addr show

1: lo: mtu 65536 qdisc noqueue state UNKNOWN

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether 00:0c:29:e1:2f:66 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.6/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.8/24 brd 192.168.1.255 scope global secondary eth0 // this is the VIP

inet6 fe80::20c:29ff:fee1:2f66/64 scope link

valid_lft forever preferred_lft forever
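
Looking for the VIP by eye works for a one-off check; for a repeatable test, the output of `ip addr show` can be searched for the address. The function name `has_addr` below is hypothetical, and the sample input is copied from the output above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IPv4 address appears as an
# "inet" entry in `ip addr show`-style output read from stdin.
has_addr() {
    addr=$1
    # -F treats the dots literally; the trailing slash avoids matching 192.168.1.80.
    grep -qF "inet $addr/"
}

has_addr 192.168.1.8 <<'EOF' && echo "VIP is present"
inet 192.168.1.6/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.8/24 brd 192.168.1.255 scope global secondary eth0
EOF
```

On a cluster node the same check could be run as `ip addr show eth0 | has_addr 192.168.1.8`.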

3. Open the site in a browser; note that the address to enter is the VIP

f4c48f161adaca53d6919f2a76ac2913.png

4. Now put datanode4 into standby

46f9871fb274174fadb164f02e12b7dc.png

Check on host snn: the resources have moved over successfully

[root@snn ~]# ip addr show

1: lo: mtu 16436 qdisc noqueue state UNKNOWN

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

inet6 ::1/128 scope host

valid_lft forever preferred_lft forever

2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether 00:0c:29:b1:89:48 brd ff:ff:ff:ff:ff:ff

inet 192.168.1.5/24 brd 192.168.1.255 scope global eth0

inet 192.168.1.8/24 brd 192.168.1.255 scope global secondary eth0

inet6 fe80::20c:29ff:feb1:8948/64 scope link

valid_lft forever preferred_lft forever

Refresh the browser; the page content is unchanged

0ad0c3ad3d20e800ce8c46787c8d9ea3.png

640d4d4d6be6eaaad5e3f57aacac0e17.png

589cb1bbaae54a26a1b0452cbed3c197.png

---END--
