Ops in Practice: Load Balancing with HAProxy

Unlike the LVS setup described earlier, HAProxy places few requirements on the RealServers: they do not need a virtual IP (VIP) configured, nor do they need ARP behavior controlled through arptables or the arp protocol.

Compared with LVS, HAProxy can operate at both layer 4 and layer 7. Its raw performance may not match LVS, but its rule system offers a level of configurability that LVS lacks: parameters let you split traffic between frontends and backends using different balancing algorithms.

As the configuration below shows, HAProxy is organized mainly into frontend and backend sections, and it natively supports port forwarding. By contrast, LVS DR mode and tunnel mode do not support port forwarding at all, and although NAT mode does, it is harder to set up and can run into interaction problems between the RealServers and the scheduler.
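As a quick illustration of that port-forwarding ability, here is a minimal sketch (the 192.168.0.10 address and port 8080 are hypothetical and not part of this lab) that publishes port 80 and forwards to a backend listening on 8080:

frontend web_in
    bind *:80                               ## public-facing port
    default_backend app_8080

backend app_8080
    server app1 192.168.0.10:8080 check     ## RealServer listening on a different port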

Example configuration file

#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend  main *:80
#    acl url_static       path_beg       -i /static /images /javascript /stylesheets
#    acl url_static       path_end       -i .jpg .gif .png .css .js
#
#    use_backend static          if url_static
    default_backend             app

#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
#backend static
#    balance     roundrobin
#    server      static 127.0.0.1:4331 check

#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    balance     roundrobin
    server  app1 172.25.5.2:80 check
    server  app2 172.25.5.3:80 check

The parts of the configuration above that you are likely to change, and that directly affect the service, are explained below.

frontend  main *:80						## HAProxy frontend listening on port 80
default_backend             app			## default backend is "app" (a user-defined name)

backend app								## backend section
    balance     roundrobin				## use the round-robin (rr) balancing algorithm
    server  app1 172.25.5.2:80 check	## RealServer1
    server  app2 172.25.5.3:80 check	## RealServer2
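After editing the configuration it is worth validating the syntax before reloading; a minimal check, assuming HAProxy is managed by systemd as in this lab:

## check the configuration for syntax errors, then reload without dropping existing connections
haproxy -c -f /etc/haproxy/haproxy.cfg && systemctl reload haproxy.service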

Because the traffic passing through the scheduler is usually enormous, logging everything would consume a huge amount of storage, so in production the logs are normally not recorded in full.

For this experiment, and to understand the configuration file, we choose to record the logs and store them separately.

As shown in the configuration file, the default log target is:

log         127.0.0.1 local2

that is, the local2 facility on the local host.

The local2 facility has no dedicated rule in the system log manager rsyslog by default, so to store the HAProxy logs separately and keep /var/log/messages clean, the following steps are needed.

  • Add the -r option to /etc/sysconfig/rsyslog
  • Edit /etc/rsyslog.conf so that local2 messages are not written to the messages file
  • Edit /etc/rsyslog.conf so that all local2 messages are written to haproxy.log
  • Use UDP and enable UDP reception with the imudp module listening on port 514
##Edit /etc/sysconfig/rsyslog
# Options for rsyslogd
# Syslogd options are deprecated since rsyslog v3.
# If you want to use them, switch to compatibility mode 2 by "-c 2"
# See rsyslogd(8) for more details
SYSLOGD_OPTIONS="-r"
##Enable the UDP reception port
# Provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 514

##Keep local2 messages out of /var/log/messages

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local2.none    /var/log/messages

##Write all local2 messages to haproxy.log
# Save Haproxy messages to haproxy.log
local2.*                       				/var/log/haproxy.log

##Note
The scheduler handles a very large volume of traffic, and logging it continuously consumes an enormous amount of storage, so in practice the logs are not recorded in full; it is done here only for the experiment.
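To apply the changes and confirm that logging works end to end, the following sketch restarts both services and writes a test message to the local2 facility (standard commands on a systemd host):

systemctl restart rsyslog.service
systemctl restart haproxy.service
## send a test message to local2, then check the dedicated log file
logger -p local2.info "haproxy logging test"
tail -n 5 /var/log/haproxy.log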

Raising the maximum file-descriptor limit (also applies to later experiments)

In the sample configuration above you can see

maxconn                 3000

In real workloads this level of concurrency is certainly not enough and has to be raised; here we raise it to 65535.

At the same time, logic tells us:

The application is constrained by the operating system, and the operating system is constrained by the kernel.

kernel --> OS --> application

So if the application's maximum file-descriptor count is raised to 65535, the limits that the OS and the kernel grant to that application must also be at least 65535.

Since the kernel limits are normally left untouched, here we only demonstrate how to raise the OS-level file-descriptor limit.

vim /etc/security/limits.conf

##Append at the end of the file
# "-" means the soft limit and the hard limit are the same
haproxy		- 	nofile		65535
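To double-check that the limits actually take effect, the sketch below inspects the kernel-level ceilings (left unchanged here) and the limit of the running HAProxy process:

## kernel-wide ceilings
sysctl fs.file-max
sysctl fs.nr_open
## per-process limit of the running haproxy instance
grep "Max open files" /proc/$(pidof -s haproxy)/limits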

The eight HAProxy balancing algorithms

  1. balance roundrobin # round robin; virtually every software load balancer supports it
  2. balance static-rr # weighted round robin; recommended
  3. balance leastconn # least connections served first; recommended (see the sketch after this list)
  4. balance source # hash of the client's source IP; recommended
  5. balance uri # hash of the request URI
  6. balance url_param # hash of a URL parameter; 'balance url_param' requires an URL parameter name
  7. balance hdr(name) # pin each HTTP request according to the given HTTP request header
  8. balance rdp-cookie(name) # pin and hash each TCP session according to cookie(name)
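For example, switching the app backend from round robin to least-connections only changes the balance line; a sketch using the same RealServers as above:

backend app
    balance     leastconn               ## new requests go to the server with the fewest active connections
    server  app1 172.25.5.2:80 check
    server  app2 172.25.5.3:80 check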

Simple access control

acl url_static       path_beg       -i /static /images /javascript /stylesheets
acl url_static       path_end       -i .jpg .gif .png .css .js

In the example above this block is commented out; it is in fact one way of writing a scheduling rule.

It means the following:

  • When the request path begins with /static, /images, /javascript or /stylesheets, the static rule is used and the request is scheduled to the matching backend.
  • When the request path ends with .jpg, .gif, .png, .css or .js, the static rule is used and the request is scheduled to the matching backend.

Example configuration

Here the static backend contains only one RS, 172.25.5.3, so every request matching the rules above is scheduled to that RS.

#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 65535

listen stats *:8080
    stats uri /status
    stats auth admin:password
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js

#    acl blacklist src 172.25.0.250/24
#    block if blacklist

#    errorloc 403			http://www.baidu.com

    use_backend static          if url_static
    default_backend             app

#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
    balance     roundrobin
    server  app2 172.25.5.3:80 check

#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    balance     roundrobin
    server  app1 172.25.5.2:80 check
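
A quick way to confirm the split is to request a default path and a static-looking path through the scheduler and see which RS answers; a sketch assuming the scheduler is reachable at 172.25.5.1 and each RS serves a page that identifies itself:

## default path -> backend "app" (172.25.5.2)
curl -s http://172.25.5.1/
## path beginning with /static -> backend "static" (172.25.5.3)
curl -s http://172.25.5.1/static/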

[root@server1 ~]# cat /etc/security/limits.conf 
# /etc/security/limits.conf
#
#This file sets the resource limits for the users logged in via PAM.
#It does not affect resource limits of the system services.
#
#Also note that configuration files in /etc/security/limits.d directory,
#which are read in alphabetical order, override the settings in this
#file in case the domain is the same or more specific.
#That means for example that setting a limit for wildcard domain here
#can be overriden with a wildcard setting in a config file in the
#subdirectory, but a user specific setting here can be overriden only
#with a user specific setting in the subdirectory.
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open file descriptors
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4

# End of file
haproxy		 -	 nofile		 65535

Read/write separation

HAProxy rules can split traffic not only by the content being requested but also by the request type; for example, the following requirement can be implemented.

  • All GET traffic goes to Server2
  • All POST traffic goes to Server3

This achieves read/write separation: all read operations go to Server2, while write operations such as file uploads go to Server3.

Example configuration

#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 65535

listen stats *:8080
    stats uri /status
    stats auth admin:password
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend  main *:80
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js

    acl read method GET
    acl read method HEAD
    acl write method PUT
    acl write method POST

#    use_backend static          if url_static
    use_backend static 		if write
    default_backend             app

#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
    balance     roundrobin
    server  app2 172.25.5.3:80 check

#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    balance     roundrobin
    server  app1 172.25.5.2:80 check
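
To verify the split, send a read and a write through the scheduler and watch which RS handles each; a sketch assuming the scheduler answers at 172.25.5.1 and the upload pages shown next are deployed on the RealServers (test.jpg is any local image under 20 KB):

## GET (read)  -> default backend "app" (Server2, 172.25.5.2)
curl -s http://172.25.5.1/index.php
## POST (write) -> backend "static" (Server3, 172.25.5.3)
curl -s -F "file=@test.jpg" -F "submit=Submit" http://172.25.5.1/upload_file.php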

Contents of index.php

<html>
<body>

<form action="upload_file.php" method="post"
enctype="multipart/form-data">
<label for="file">Filename:</label>
<input type="file" name="file" id="file" /> 
<br />
<input type="submit" name="submit" value="Submit" />
</form>

</body>
</html>

Contents of upload_file.php

<?php
if ((($_FILES["file"]["type"] == "image/gif")
|| ($_FILES["file"]["type"] == "image/jpeg")
|| ($_FILES["file"]["type"] == "image/pjpeg"))
&& ($_FILES["file"]["size"] < 20000))
  {
  if ($_FILES["file"]["error"] > 0)
    {
    echo "Return Code: " . $_FILES["file"]["error"] . "<br />";
    }
  else
    {
    echo "Upload: " . $_FILES["file"]["name"] . "<br />";
    echo "Type: " . $_FILES["file"]["type"] . "<br />";
    echo "Size: " . ($_FILES["file"]["size"] / 1024) . " Kb<br />";
    echo "Temp file: " . $_FILES["file"]["tmp_name"] . "<br />";

    if (file_exists("upload/" . $_FILES["file"]["name"]))
      {
      echo $_FILES["file"]["name"] . " already exists. ";
      }
    else
      {
      move_uploaded_file($_FILES["file"]["tmp_name"],
      "upload/" . $_FILES["file"]["name"]);
      echo "Stored in: " . "upload/" . $_FILES["file"]["name"];
      }
    }
  }
else
  {
  echo "Invalid file";
  }
?>

The built-in stats page

[Screenshots: authentication is required to open the stats page; the upload page opens through either scheduler; preparing an upload.]
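The stats page comes from the listen stats block in the configuration (port 8080, URI /status, basic auth admin:password); a quick command-line check, assuming the scheduler at 172.25.5.1:

curl -s -u admin:password http://172.25.5.1:8080/status | head -n 5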

Implementing high availability

As with LVS, HAProxy also needs high availability to be considered.

Although yesterday's experiment is already over, keep the following in mind:

  • keepalived itself only does health management of the machines
  • it does not monitor the service itself
  • so if the machine is fine but the service has failed and cannot work properly, keepalived will not notice.

Therefore, when using HAProxy, we achieve service-level high availability with corosync and pacemaker.

  • corosync provides the heartbeat for the two-node hot standby
  • pacemaker acts as the cluster resource manager

Condensed version

##Install the required packages

##Set the hacluster user's password on both schedulers
echo westos | passwd --stdin hacluster 
ssh server4 'echo westos | passwd --stdin hacluster'

##Cluster creation steps
##Authenticate the hosts that will join the cluster
pcs cluster auth server1 server4
##Create a cluster made up of the two schedulers
pcs cluster setup --name TestCluster server1 server4
##Start the cluster and enable it at boot
pcs cluster start --all
pcs cluster enable --all

##Verify the cluster configuration
crm_verify -LV
##Disable the stonith-enabled option, which this experiment does not need yet
pcs property set stonith-enabled=false
crm_verify -LV

##Create the VIP resource: address, netmask and a 30s monitoring interval
pcs resource create VIP ocf:heartbeat:IPaddr2 ip=172.25.5.100 cidr_netmask=24 op monitor interval=30s
##Create the haproxy resource, monitored every 60s
pcs resource create haproxy systemd:haproxy op monitor interval=60s

##Group the two resources so they always fail over to the same scheduler together
pcs resource group add hagroup VIP haproxy

##Delete the VIP on server1 and watch the cluster restore it
ip addr delete 172.25.5.100/24 dev eth0
ip addr
##Stop haproxy and watch the cluster recover the service
systemctl stop haproxy.service 
pcs status

##Put server1 into standby: the HA stack fails traffic over to server4
##Bring server1 back out of standby: the state recovers
pcs node standby 
pcs status
pcs node unstandby 
pcs status
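After these steps, a quick sanity check is to confirm that the VIP sits on the active node and that the service answers through it; a sketch using the VIP and interface from this lab:

## the VIP should appear on whichever scheduler currently runs the hagroup resources
ip addr show dev eth0 | grep 172.25.5.100
## the backend pages should be reachable through the VIP
curl -s http://172.25.5.100/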

Full version, including logs and the running process

##First, add the HighAvailability repository in the same way
##Server1 and Server4 both join the cluster, so they need identical setup

[HighAvailability]
name=HighAvailability
baseurl=http://172.25.66.254/rhel6.5/addons/HighAvailability
gpgcheck=0
enabled=1

[root@server1 ~]# scp /etc/yum.repos.d/NeuWings.repo server4:/etc/yum.repos.d/NeuWings.repo
[root@server1 ~]# scp /etc/haproxy/haproxy.cfg server4:/etc/haproxy/haproxy.cfg
[root@server1 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python

##Remove the manually configured VIP from the schedulers (pacemaker will manage it from now on)
[root@server1 ~]# ip addr del 172.25.5.100/24 dev eth0
[root@server4 ~]# ip addr del 172.25.5.100/24 dev eth0

##Reload the HAProxy service
[root@server1 ~]# systemctl reload haproxy.service
[root@server4 ~]# systemctl reload haproxy.service

##Enable and start the pcsd service required by pcs
[root@server1 ~]# systemctl enable --now pcsd.service 
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
[root@server1 ~]# ssh server4 systemctl enable --now pcsd.service  
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.

##Set a password for the hacluster system user required by the cluster
[root@server1 ~]# echo westos | passwd --stdin hacluster 
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
[root@server1 ~]# ssh server4 'echo westos | passwd --stdin hacluster'
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.

Now for the cluster setup proper.

##Authenticate the cluster nodes
[root@server1 ~]# pcs cluster auth server1 server4
Username: hacluster
server4: Authorized
server1: Authorized

##Create a cluster named TestCluster containing the two hosts server1 and server4
[root@server1 ~]# pcs cluster setup --name TestCluster server1 server4
Destroying cluster on nodes: server1, server4...
server1: Stopping Cluster (pacemaker)...
server4: Stopping Cluster (pacemaker)...
server4: Successfully destroyed cluster
server1: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'server1', 'server4'
server1: successful distribution of the file 'pacemaker_remote authkey'
server4: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
server1: Succeeded
server4: Succeeded

Synchronizing pcsd certificates on nodes server1, server4...
server4: Success
server1: Success
Restarting pcsd on the nodes in order to reload the certificates...
server4: Success
server1: Success

##At this point the cluster is not yet started or enabled
[root@server1 ~]# pcs status
Error: cluster is not currently running on this node

##Start and enable it
[root@server1 ~]# pcs cluster start --all
server1: Starting Cluster (corosync)...
server4: Starting Cluster (corosync)...
server1: Starting Cluster (pacemaker)...
server4: Starting Cluster (pacemaker)...
[root@server1 ~]# pcs cluster enable --all
server1: Cluster Enabled
server4: Cluster Enabled

##The cluster is now started and enabled
[root@server1 ~]# pcs status
Cluster name: TestCluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Stack: unknown
Current DC: NONE
Last updated: Thu Apr  1 14:36:15 2021
Last change: Thu Apr  1 14:36:04 2021 by hacluster via crmd on server1

2 nodes configured
0 resources configured

Node server1: UNCLEAN (offline)
Node server4: UNCLEAN (offline)

No resources


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
  
##However, the output warns that stonith-enabled is not false
##Checking the configuration database at this point also reports errors
[root@server1 ~]# crm_verify -LV
   error: unpack_resources:	Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:	Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:	NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid

##Set stonith-enabled=false
##Re-check the configuration database and cluster status: the warnings are gone
[root@server1 ~]# pcs property set stonith-enabled=false
[root@server1 ~]# crm_verify -LV
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:37:08 2021
Last change: Thu Apr  1 14:36:58 2021 by root via cibadmin on server1

2 nodes configured
0 resources configured

Online: [ server1 server4 ]

No resources


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
  

##Define two monitored resources
##A resource named haproxy, using the systemd:haproxy agent to monitor the service every 60s
##A resource named VIP, using the heartbeat IPaddr2 agent to monitor the virtual IP every 30s
[root@server1 ~]# pcs resource create haproxy systemd:haproxy op monitor interval=60s
[root@server1 ~]# pcs resource create VIP ocf:heartbeat:IPaddr2 ip=172.25.5.100 cidr_netmask=24 op monitor interval=30s
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:43:33 2021
Last change: Thu Apr  1 14:43:30 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Online: [ server1 server4 ]

Full list of resources:

 haproxy	(systemd:haproxy):	Started server4
 VIP	(ocf::heartbeat:IPaddr2):	Started server1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
  
  
##However, this can lead to a situation we do not want:
##the two resources may land on different schedulers, breaking access.
##So we bundle the two resources into a group so that they always run on the same host
[root@server1 ~]# pcs resource group add hagroup VIP haproxy
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:44:07 2021
Last change: Thu Apr  1 14:44:03 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Online: [ server1 server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server1
     haproxy	(systemd:haproxy):	Starting server1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled


##Test section: manually simulate scheduler failures
##Delete the VIP and watch the cluster detect the failure and recover it
##Stop HAProxy and watch the cluster restart the service
[root@server1 ~]# ip addr delete 172.25.5.100/24 dev eth0
[root@server1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:54:00:1e:4d:a5 brd ff:ff:ff:ff:ff:ff
    inet 172.25.5.1/24 brd 172.25.5.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe1e:4da5/64 scope link 
       valid_lft forever preferred_lft forever
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:45:04 2021
Last change: Thu Apr  1 14:44:03 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Online: [ server1 server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server1
     haproxy	(systemd:haproxy):	Starting server1

Failed Actions:
* VIP_monitor_30000 on server1 'not running' (7): call=18, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:45:01 2021', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@server1 ~]# systemctl stop haproxy.service 
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:46:05 2021
Last change: Thu Apr  1 14:44:03 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Online: [ server1 server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server1
     haproxy	(systemd:haproxy):	FAILED server1

Failed Actions:
* haproxy_monitor_60000 on server1 'not running' (7): call=28, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:46:05 2021', queued=0ms, exec=0ms
* VIP_monitor_30000 on server1 'not running' (7): call=18, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:45:01 2021', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enable
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:46:18 2021
Last change: Thu Apr  1 14:44:03 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Online: [ server1 server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server1
     haproxy	(systemd:haproxy):	Started server1

Failed Actions:
* haproxy_monitor_60000 on server1 'not running' (7): call=28, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:46:05 2021', queued=0ms, exec=0ms
* VIP_monitor_30000 on server1 'not running' (7): call=18, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:45:01 2021', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@server1 ~]# systemctl status haproxy.service 
● haproxy.service - Cluster Controlled haproxy
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/haproxy.service.d
           └─50-pacemaker.conf
   Active: active (running) since Thu 2021-04-01 14:46:07 CST; 1min 2s ago
 Main PID: 17908 (haproxy-systemd)
   CGroup: /system.slice/haproxy.service
           ├─17908 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
           ├─17909 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
           └─17910 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds

Apr 01 14:46:07 server1 systemd[1]: Started Cluster Controlled haproxy.
Apr 01 14:46:07 server1 haproxy-systemd-wrapper[17908]: haproxy-systemd-wrapper: executing /usr/sbi...Ds
Hint: Some lines were ellipsized, use -l to show in full.
[root@server1 ~]# pcs node standby 
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:50:08 2021
Last change: Thu Apr  1 14:50:07 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Node server1: standby
Online: [ server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server1
     haproxy	(systemd:haproxy):	Stopping server1

Failed Actions:
* haproxy_monitor_60000 on server1 'not running' (7): call=28, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:46:05 2021', queued=0ms, exec=0ms
* VIP_monitor_30000 on server1 'not running' (7): call=18, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:45:01 2021', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@server1 ~]# pcs node unstandby 
[root@server1 ~]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 14:50:38 2021
Last change: Thu Apr  1 14:50:23 2021 by root via cibadmin on server1

2 nodes configured
2 resources configured

Online: [ server1 server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server4
     haproxy	(systemd:haproxy):	Started server4

Failed Actions:
* haproxy_monitor_60000 on server1 'not running' (7): call=28, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:46:05 2021', queued=0ms, exec=0ms
* VIP_monitor_30000 on server1 'not running' (7): call=18, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:45:01 2021', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Split brain and the fence mechanism

In the section above we built a cluster.

The nodes in a cluster exchange heartbeat packets to determine whether each other is alive.

Suppose Server1's heartbeat fails: Server4 will consider Server1 faulty, and the resources will be moved to Server4.

At the same time, however, the resources on Server1 have not been released. If the fault is only a NIC or some other device while the service itself keeps running, each node ends up believing the other has failed.

This is split brain.

To prevent split brain, a fence device is usually added to the cluster.

A fence that uses the server's own hardware interface is called internal fencing; one driven by an external power-control device is called external fencing.

When a server times out, the fence device sends a hardware management command straight to it, powering it off or rebooting it to avoid uncontrollable consequences, and at the same time signals the other nodes to take over the service.

Experiment environment

Server1 and Server4 as schedulers
Server2 and Server3 as RealServers
The host machine as the fence device

Host machine setup

##Check which fence packages are installed on the host and install any that are missing
rpm -qa | grep ^fence

##Install the fence service components
yum install fence-virtd-multicast.x86_64 -y
yum install fence-virtd-libvirt.x86_64 -y
yum install -y fence-virtd.x86_64

##Start the service
systemctl  restart fence_virtd.service 

##Configure fence_virtd interactively
fence_virtd -c
##Generate the authentication key from random data
dd if=/dev/urandom of=fence_xvm.key bs=128 count=1
##Check that the service is listening on its UDP port
netstat -anulp | grep :1229
##Copy the key to both schedulers
scp /etc/cluster/fence_xvm.key server1:/etc/cluster/
scp /etc/cluster/fence_xvm.key server4:/etc/cluster/
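Since the key must be byte-for-byte identical everywhere, an optional consistency check (standard tools, not part of the original procedure) is to compare checksums on the host and on both schedulers:

## the checksum must match on the host, server1 and server4
md5sum /etc/cluster/fence_xvm.key
ssh server1 md5sum /etc/cluster/fence_xvm.key
ssh server4 md5sum /etc/cluster/fence_xvm.key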

Scheduler setup

##Install the fence client on the schedulers
yum install -y fence-virt

##Create the configuration directory
mkdir /etc/cluster

##List the supported fence agents
pcs stonith list

##Query the available fence devices
stonith_admin -I

##Add the fence resource in pacemaker

##Map each cluster host name to its libvirt domain (VM) name
##Note: host name first, domain name second
##A 30s monitoring interval is also set
pcs stonith create vmfence fence_xvm pcmk_host_map="server1:Node1;server4:Node4" op monitor interval=30s

##Now that a fence device is in place, turn the STONITH feature back on
pcs property set stonith-enabled=true

##Validate the cluster configuration
crm_verify -LV

##Simulate a machine failure and observe the host power-cycling the faulty scheduler

In essence this does the following: install fence-virtd-0.4.0-9.el8.x86_64, fence-virtd-multicast-0.4.0-9.el8.x86_64 and fence-virtd-libvirt-0.4.0-9.el8.x86_64 on the host machine, and install fence-virt on the cluster hosts.

On the host, configure the fence daemon and generate the key needed for authentication.

Check the host's fence configuration and its UDP listening port.

Copy the authentication key to the two schedulers in the cluster.

Configure the cluster for fencing and set up the host-name-to-domain-name mapping.

Simulate scheduler failures (such as a NIC hardware fault or a frozen machine) and observe the host power-cycling the faulty machine.

Verification

  • Running echo c > /proc/sysrq-trigger crashes the system: the terminal connection drops and no commands can be entered, and the fence device can be seen rebooting the crashed node
  • Likewise, downing the network interface causes a split brain in the cluster, and the fence device can be seen rebooting the node whose NIC failed (a manual fencing check is sketched after this list)
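Before crashing anything, the fencing path can also be exercised by hand from a cluster node; a sketch, assuming the libvirt domain names Node1/Node4 used in pcmk_host_map:

## list the domains visible through fence_virtd (confirms the key and multicast channel work)
fence_xvm -o list
## forcibly reboot one guest through the fence device (disruptive!)
fence_xvm -o reboot -H Node4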

Step-by-step transcript

[root@foundation5 Desktop]# rpm -qa | grep ^fence
fence-virtd-serial-0.4.0-9.el8.x86_64
fence-virtd-0.4.0-9.el8.x86_64
fence-virtd-tcp-0.4.0-9.el8.x86_64
fence-virt-0.4.0-9.el8.x86_64
fence-virtd-multicast-0.4.0-9.el8.x86_64
fence-virtd-libvirt-0.4.0-9.el8.x86_64

##Install the fence service components
yum install fence-virtd-multicast.x86_64 -y
yum install fence-virtd-libvirt.x86_64 -y
yum install -y fence-virtd.x86_64

[root@foundation5 Desktop]# systemctl  restart fence_virtd.service 
[root@foundation5 Desktop]# netstat -anulp | grep :1229
udp        0      0 0.0.0.0:1229            0.0.0.0:*                           32401/fence_virtd   
[root@foundation5 cluster]# scp fence_xvm.key server1:/etc/cluster/
fence_xvm.key                                                         100%  128   190.8KB/s   00:00    
[root@foundation5 cluster]# scp fence_xvm.key server4:/etc/cluster/
fence_xvm.key                                                         100%  128   198.5KB/s   00:00    
[root@server1 ~]# yum install -y fence-virt
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Resolving Dependencies
--> Running transaction check
---> Package fence-virt.x86_64 0:0.3.2-13.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================
 Package                  Arch                 Version                      Repository             Size
========================================================================================================
Installing:
 fence-virt               x86_64               0.3.2-13.el7                 RHEL7.6                41 k

Transaction Summary
========================================================================================================
Install  1 Package

Total download size: 41 k
Installed size: 82 k
Downloading packages:
fence-virt-0.3.2-13.el7.x86_64.rpm                                               |  41 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : fence-virt-0.3.2-13.el7.x86_64                                                       1/1 
  Verifying  : fence-virt-0.3.2-13.el7.x86_64                                                       1/1 

Installed:
  fence-virt.x86_64 0:0.3.2-13.el7                                                                      

Complete!
[root@server1 ~]# ssh server4 yum install -y fence-virt
root@server4's password: 
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Resolving Dependencies
--> Running transaction check
---> Package fence-virt.x86_64 0:0.3.2-13.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package            Arch           Version                Repository       Size
================================================================================
Installing:
 fence-virt         x86_64         0.3.2-13.el7           RHEL7.6          41 k

Transaction Summary
================================================================================
Install  1 Package

Total download size: 41 k
Installed size: 82 k
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : fence-virt-0.3.2-13.el7.x86_64                               1/1 
  Verifying  : fence-virt-0.3.2-13.el7.x86_64                               1/1 

Installed:
  fence-virt.x86_64 0:0.3.2-13.el7                                              

Complete!
[root@server1 ~]# cd /etc/
[root@server1 etc]# mkdir cluster
[root@server1 etc]# ssh server4 mkdir /etc/cluster
[root@server1 etc]# pcs stonith create vmfence fence_xvm pcmk_host_map="server1:Node1;server4:Node4" op monitor interval=30s
[root@server1 etc]# pcs property set stonith-enabled=true
[root@server1 etc]# pcs status
Cluster name: TestCluster
Stack: corosync
Current DC: server4 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Apr  1 16:18:56 2021
Last change: Thu Apr  1 16:18:50 2021 by root via cibadmin on server1

2 nodes configured
3 resources configured

Online: [ server1 server4 ]

Full list of resources:

 Resource Group: hagroup
     VIP	(ocf::heartbeat:IPaddr2):	Started server4
     haproxy	(systemd:haproxy):	Started server4
 vmfence	(stonith:fence_xvm):	Started server1

Failed Actions:
* haproxy_monitor_60000 on server1 'not running' (7): call=28, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:46:05 2021', queued=0ms, exec=0ms
* VIP_monitor_30000 on server1 'not running' (7): call=18, status=complete, exitreason='',
    last-rc-change='Thu Apr  1 14:45:01 2021', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The interactive fence_virtd -c configuration

Module search path [/usr/lib64/fence-virt]: 

Available backends:
    libvirt 0.3
Available listeners:
    multicast 1.2
    serial 0.4
    tcp 0.1

Listener modules are responsible for accepting requests
from fencing clients.

Listener module [multicast]: 

The multicast listener module is designed for use environments
where the guests and hosts may communicate over a network using
multicast.

The multicast address is the address that a client will use to
send fencing requests to fence_virtd.

Multicast IP Address [225.0.0.12]: 

Using ipv4 as family.

Multicast IP Port [1229]: 

Setting a preferred interface causes fence_virtd to listen only
on that interface.  Normally, it listens on all interfaces.
In environments where the virtual machines are using the host
machine as a gateway, this *must* be set (typically to virbr0).
Set to 'none' for no interface.

Interface [br0]: 

The key file is the shared key information which is used to
authenticate fencing requests.  The contents of this file must
be distributed to each physical host and virtual machine within
a cluster.

Key File [/etc/cluster/fence_xvm.key]: 

Backend modules are responsible for routing requests to
the appropriate hypervisor or management layer.

Backend module [libvirt]: 

The libvirt backend module is designed for single desktops or
servers.  Do not use in environments where virtual machines
may be migrated between hosts.

Libvirt URI [qemu:///system]: 

Configuration complete.

=== Begin Configuration ===
fence_virtd {
	listener = "multicast";
	backend = "libvirt";
	module_path = "/usr/lib64/fence-virt";
}

listeners {
	multicast {
		key_file = "/etc/cluster/fence_xvm.key";
		address = "225.0.0.12";
		interface = "br0";
		family = "ipv4";
		port = "1229";
	}

}

backends {
	libvirt {
		uri = "qemu:///system";
	}

}

=== End Configuration ===