[Author: Zhang Yan. Version: v1.1. Last modified: 2008.07.17. When reposting, please credit the source: http://blog.s135.com]
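Citrix NetScaler is a solid Layer 4-7 hardware load-balancing switch whose market share is second only to F5 BIG-IP. NetScaler 8.0 is the latest version of the NetScaler product line officially released by Citrix Systems, Inc.
In my own testing, NetScaler 8.0 beat F5 BIG-IP on both performance and features for Layer 7 load balancing.
NetScaler 8.0's load-balancing monitors can health-check MySQL databases, whereas F5 BIG-IP currently supports health checks only for Oracle and Microsoft SQL Server databases.
However, the MySQL health-check script bundled with NetScaler 8.0 (nsmysql.pl) is incomplete: it can only check whether a single SQL statement executes without error. In a MySQL master/slave setup it cannot tell whether a MySQL Slave is replicating normally, whether tables are corrupted, whether replication lag is too large, whether replication errors have occurred, or whether the number of non-sleeping threads is too high. So I wrote my own health-check script for load-balanced MySQL Slave databases on NetScaler 8.0 (nsmysql-slave.pl), which covers these needs.
With "nsmysql-slave.pl" doing the health checks and a NetScaler VIP (virtual IP) in front, load balancing across multiple MySQL Slave databases works cleanly. When a MySQL Slave stops replicating, has corrupted tables, lags too far behind (by default the script treats more than 600 seconds behind the master as excessive lag; adjust this to your application), or has too many active threads (by default more than 200; adjust this to your application), "nsmysql-slave.pl" detects it and reports it to NetScaler. NetScaler then marks that database as down and stops sending user queries to it. Once the fault is fixed, "nsmysql-slave.pl" reports that to NetScaler as well, and the database goes back into service.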
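The decision rules described above can be sketched as a small, database-free function. This is only an illustration of the logic: the function name `evaluate_slave`, the `%DEFAULT_LIMITS` hash, and its key names are hypothetical, while the field names come from `SHOW SLAVE STATUS` and the 600/200 values are the defaults quoted in the text.

```perl
use strict;
use warnings;

# Hypothetical container for the two tunable thresholds mentioned in the text.
my %DEFAULT_LIMITS = (max_lag_seconds => 600, max_running_threads => 200);

# Takes one row of SHOW SLAVE STATUS (as a hashref) and the Threads_running
# value. Returns (0) when the slave looks healthy, or (1, reason) otherwise.
sub evaluate_slave {
    my ($slave, $threads_running, $limits) = @_;
    $limits ||= \%DEFAULT_LIMITS;

    my $sql = defined $slave->{Slave_SQL_Running} ? $slave->{Slave_SQL_Running} : '';
    my $io  = defined $slave->{Slave_IO_Running}  ? $slave->{Slave_IO_Running}  : '';
    my $err = defined $slave->{Last_Error}        ? $slave->{Last_Error}        : '';
    my $lag = $slave->{Seconds_Behind_Master};

    return (1, "Slave SQL thread has stopped") if $sql eq 'No';
    return (1, "Slave IO thread has stopped")  if $io  eq 'No';
    return (1, "Replication error: $err")      if $err ne '';
    return (1, "Replication lag above $limits->{max_lag_seconds} seconds")
        if defined $lag && $lag > $limits->{max_lag_seconds};
    return (1, "Running threads above $limits->{max_running_threads}")
        if defined $threads_running
        && $threads_running > $limits->{max_running_threads};
    return (0);
}

# Example: replication is running but lags 900 seconds behind the master.
my ($status, $reason) = evaluate_slave(
    { Slave_SQL_Running => 'Yes', Slave_IO_Running => 'Yes',
      Last_Error => '', Seconds_Behind_Master => 900 },
    12,
);
print "status=$status reason=$reason\n";
```

Keeping the rules in a function that takes plain data makes it possible to try different thresholds without a live MySQL server.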
The source code of "nsmysql-slave.pl" is as follows:
#!/usr/bin/perl -w
################################################################
##
## MySQL Slave Server Monitoring Script V1.1 for NetScaler 8.x
## Written by Zhang Yan (blog.s135.com) on July 17, 2008
##
################################################################
## This is a netscaler supplied script.
## This script is used to do MySQL slave server monitoring
## using KAS feature.
## The mandatory arguments are:
## 1. Database to which the user is going to connect
## 2. User name
## The optional arguments are:
## 1. password: This is the password that will be used to
##    login into the server. If no password is
##    given a blank password is used.
## 2. SQL query
## Example:
## set monitor ... -scriptArgs "database=test;user=user1;password=password;
## query=show slave status"
use strict;
use DBI;
use Netscaler::KAS;

## This function is a handler for performing MySQL probe in KAS mode
sub mysql_probe
{
    ## There must be at least 3 arguments to this function.
    ## 1. First argument is the IP that has to be probed.
    ## 2. Second argument is the port to connect to.
    ## 3. Arguments to be used during probing.
    if(scalar(@_) < 3)
    {
        return (1,"Arguments not given");
    }
    ## Parse the argument given, to get database,user name,password,SQL query.
    ## If parsing fails, it is monitoring probe failure.
    $_[2]=~/database=([^;]+);user=([^;]+)(;password=([^;]+))?(;query=([^;]+))?/
        or return (1,"Invalid argument format");
    (my $database,my $username,my $password,my $sql_query)=($1,$2,$4,$6);
    ## If no password is given, try blank password
    if(!defined($password))
    {
        $password="";
    }
    ## Try to connect to the server.
    ## DBI reports connection errors in $DBI::errstr, not in $!.
    my $db_handle = DBI->connect("dbi:mysql:database=$database:host=$_[0]:$_[1]",$username,$password)
        or return (1,"Connection to database failed - $DBI::errstr");
    ## Check MySQL Slave Server: fetch the replication status row
    my $slave_info = $db_handle->prepare("show slave status");
    $slave_info->execute()
        or return (1,"Execution of SQL query failed");
    my $slave_ref = $slave_info->fetchrow_hashref()
        or return (1,"Fetchrow of SQL query failed");
    ## Release the statement handle so disconnect() does not warn about it
    $slave_info->finish();
    ## Fetch the number of threads that are not sleeping
    my $threads_info = $db_handle->prepare("show global status like 'Threads_running'");
    $threads_info->execute()
        or return (1,"Execution of SQL query failed");
    my $threads_ref = $threads_info->fetchrow_hashref()
        or return (1,"Fetchrow of SQL query failed");
    $threads_info->finish();
    if (exists $slave_ref->{Slave_SQL_Running} and $slave_ref->{Slave_SQL_Running} eq 'No')
    {
        $db_handle->disconnect();
        return (1,"Slave SQL thread has stopped");
    }
    elsif (exists $slave_ref->{Slave_IO_Running} and $slave_ref->{Slave_IO_Running} eq 'No')
    {
        $db_handle->disconnect();
        return (1,"Slave IO thread has stopped");
    }
    elsif (exists $slave_ref->{Last_Error} and $slave_ref->{Last_Error} ne '')
    {
        $db_handle->disconnect();
        return (1,"Slave reports a replication error");
    }
    ## Seconds_Behind_Master can be NULL (undef), so test defined() before comparing
    elsif (defined $slave_ref->{Seconds_Behind_Master} and $slave_ref->{Seconds_Behind_Master} > 600)
    {
        $db_handle->disconnect();
        return (1,"Slave is more than 600 seconds behind the master");
    }
    elsif (exists $threads_ref->{Value} and $threads_ref->{Value} > 200)
    {
        $db_handle->disconnect();
        return (1,"More than 200 threads are not sleeping");
    }
    else
    {
        ## If no query is given then it is probe success , else try executing the query
        if(!defined($sql_query))
        {
            $db_handle->disconnect();
            return 0;
        }
        ## Problem during query execution, report failure
        my $statement = $db_handle->prepare($sql_query)
            or return (1,"Preparation of SQL query failed");
        $statement->execute()
            or return (1,"Execution of SQL query failed");
        $statement->finish();
    }
    ## Probe Succeeded.
    $db_handle->disconnect();
    return 0;
}

## Register MySQL probe handler, to the KAS module.
probe(\&mysql_probe);
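To see what the script accepts in `-scriptArgs`, the parsing step can be tried on its own, without DBI or the KAS module. The sketch below uses the same pattern as the script, with non-capturing groups so that the captures are numbered 1 to 4; the helper name `parse_script_args` is hypothetical.

```perl
use strict;
use warnings;

# Parses "database=...;user=...[;password=...][;query=...]".
# Returns a hashref on success and undef when the format does not match.
# The keys must appear in exactly this order, as in the monitor script.
sub parse_script_args {
    my ($args) = @_;
    return unless defined($args)
        && $args =~ /database=([^;]+);user=([^;]+)(?:;password=([^;]+))?(?:;query=([^;]+))?/;
    return {
        database => $1,
        user     => $2,
        password => defined $3 ? $3 : '',   # blank password when omitted
        query    => $4,                     # undef when no query is given
    };
}

my $parsed = parse_script_args(
    "database=test;user=user1;password=password;query=show slave status");
print "database=$parsed->{database} user=$parsed->{user} ",
      "query=$parsed->{query}\n";
```

Because values are matched with `[^;]+`, a password or query containing a semicolon cannot be passed this way.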
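With the health-check script written, the next step is to configure NetScaler 8.0:
1. Log in to the NetScaler with an SSH client such as SecureCRT, then run the following commands: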
cd /nsconfig/monitors/
vi nsmysql-slave.pl
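Paste the source code of "nsmysql-slave.pl" into the file, save and exit, then run the following command:
2. Check whether the NetScaler can connect to the MySQL Slave database:
Enter password: (enter the MySQL login password here)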
ERROR 1251: Client does not support authentication protocol requested by server; consider upgrading MySQL client
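If your MySQL Slave server is newer than version 4.0, you will see the error above. MySQL 4.1 and later use a different password authentication algorithm from MySQL 4.0 and earlier, and the MySQL client that ships with NetScaler 8.0 is version 4.0.25 by default, so it fails when connecting to MySQL 4.1, 5.x, or 6.x servers.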
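As background for the mismatch: servers before 4.1 store passwords as 16-hex-digit hashes, while 4.1 and later store a 41-character value made of `*` followed by the uppercase hex SHA-1 of the SHA-1 of the password. The sketch below reproduces the newer format and classifies a stored hash by its shape. The helper names `mysql41_hash` and `hash_format` are hypothetical, and the 16-digit sample string is only a placeholder, not the hash of any real password.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1 sha1_hex);

# 4.1+ style hash: '*' followed by uppercase hex of SHA1(SHA1(password)).
sub mysql41_hash {
    my ($password) = @_;
    return '*' . uc(sha1_hex(sha1($password)));
}

# Classifies a stored hash string by its format only.
sub hash_format {
    my ($hash) = @_;
    return 'mysql-4.1+' if $hash =~ /\A\*[0-9A-Fa-f]{40}\z/;
    return 'pre-4.1'    if $hash =~ /\A[0-9A-Fa-f]{16}\z/;
    return 'unknown';
}

my $new_style = mysql41_hash('12345678');
print length($new_style), " characters, format: ", hash_format($new_style), "\n";
print "placeholder 16-digit string, format: ", hash_format('0123456789abcdef'), "\n";
```

An account whose stored hash has the 16-digit shape is one that a 4.0-era client can authenticate against.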
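Solution 1: Upgrade the MySQL client on NetScaler 8.0. This is best avoided, though: NetScaler is tightly integrated with its underlying FreeBSD system and bundled software, and swapping in unofficial versions risks incompatibility and instability.
Solution 2: On each MySQL Slave server, create a superuser account named "netscaler" and set its password to one hashed with the old algorithm. For better security, replace % in the statements below with the NetScaler Subnet IP.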
Server version: 5.1.24-rc MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql>
FLUSH PRIVILEGES;
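3. Each MySQL Slave database must have an account that allows access from the NetScaler Subnet IP. Because everything is on the same network segment, source-IP support cannot be enabled, so the MySQL server sees the NetScaler Subnet IP as the client address:
For example: [Web server (192.168.1.21)] ──→ [NetScaler VIP (192.168.1.5)] - - - → [NetScaler Subnet IP (192.168.1.2)] ──→ [MySQL Slave server (192.168.1.31)]
The MySQL Slave server sees the IP address 192.168.1.2, so you need to add an account that permits access from the NetScaler Subnet IP ('apache'@'192.168.1.2'):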
Server version: 5.1.24-rc MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql>
FLUSH PRIVILEGES;
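4. Log in to NetScaler 8.0 through the web management interface and open the Configuration page (this requires the Java Runtime Environment, JRE 1.4.x or later):
5. Under [Load Balancing] > [Monitors], click the "add" button and add a MySQL health check named "mysql_slave":
① Interval: under normal conditions, check every 10 seconds;
② Response Timeout: each check times out after 8 seconds; this must be smaller than Interval;
③ Down Time: while the server is marked down, check every 5 seconds;
④ Retries: if the check still fails after 5 retries, mark the server as down;
⑤ Type: select MySQL;
⑥ Script Name: click the "Browse..." button next to it and select the MySQL Slave health-check script I wrote, "nsmysql-slave.pl";
⑦ Dispatcher IP and Dispatcher Port must be "127.0.0.1" and "3013"; do not change them;
⑧ User Name: enter the account created earlier, "netscaler"; password: enter the password set when creating that account, "12345678".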
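The timing values above interact, so it can help to sanity-check them before entering them in the GUI. The sketch below checks the one constraint stated in the list (Response Timeout smaller than Interval) and gives a rough estimate of how long a failed slave keeps receiving traffic. The estimate assumes one probe per Interval and that Retries consecutive failures mark the server down; that reading of the settings is an assumption, not documented NetScaler behavior, and the hash keys and function names are hypothetical.

```perl
use strict;
use warnings;

# Values taken from step 5; key names are hypothetical.
my %monitor = (interval => 10, resp_timeout => 8, down_time => 5, retries => 5);

# Returns a list of problems found in the settings (empty when consistent).
sub validate_monitor {
    my (%m) = @_;
    my @problems;
    push @problems, "resp_timeout must be smaller than interval"
        unless $m{resp_timeout} < $m{interval};
    push @problems, "retries must be at least 1"
        unless $m{retries} >= 1;
    return @problems;
}

# Rough estimate only: retries consecutive failed probes, one per interval.
sub rough_detection_seconds {
    my (%m) = @_;
    return $m{retries} * $m{interval};
}

my @problems = validate_monitor(%monitor);
my $summary  = @problems ? join("; ", @problems) : "settings look consistent";
print "$summary\n";
print "rough time to mark a failed slave down: ",
      rough_detection_seconds(%monitor), " seconds\n";
```

With the values from step 5 the estimate is 5 × 10 = 50 seconds; lowering Interval or Retries shortens it at the cost of more probes or more false alarms.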
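6. Under [Load Balancing] > [Service Groups], click the "add" button and add a MySQL server pool named "pool_mysql":
① Add the real MySQL Slave servers to the "pool_mysql" pool, selecting TCP as the protocol:
② For the health check, select the "mysql_slave" monitor created in step 5:
7. Under [Load Balancing] > [Virtual Servers], click the "add" button and add a VIP (virtual IP) named "vs_mysql_slave":
① Bind the "pool_mysql" pool to the VIP named "vs_mysql_slave" (192.168.1.5:3306), selecting TCP as the protocol:
② For the load-balancing method, select Least Connection: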
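For readers unfamiliar with the Least Connection method, the idea is to send each new connection to the healthy server that currently has the fewest active connections. The sketch below is a simplified illustration of that idea, not NetScaler's actual implementation. The function name, the record fields, and the addresses 192.168.1.32 and 192.168.1.33 are hypothetical; only 192.168.1.31 appears in the example topology.

```perl
use strict;
use warnings;

# Picks the "up" server with the fewest active connections.
# Ties are broken by name so the result is deterministic.
# Returns undef when no server is up.
sub pick_least_connections {
    my ($servers) = @_;
    my @up = grep { $_->{up} } @$servers;
    return unless @up;
    my ($best) = sort {
        $a->{active} <=> $b->{active} || $a->{name} cmp $b->{name}
    } @up;
    return $best;
}

my @pool = (
    { name => '192.168.1.31', active => 42, up => 1 },
    { name => '192.168.1.32', active => 17, up => 1 },
    { name => '192.168.1.33', active => 3,  up => 0 },  # failed its health check
);

my $target = pick_least_connections(\@pool);
print "next connection goes to $target->{name}\n";
```

The server marked down by the monitor is skipped even though it has the fewest connections, which is how the health check and the balancing method work together.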
8. To reach the MySQL Slaves, the web servers only need to connect to port 3306 on the NetScaler VIP, 192.168.1.5. This completes the load-balancing setup for multiple MySQL Slave databases.