背景
生产环境中写Hbase性能较差,故开启了三个Hbase.thrift接口,通过haproxy负载均衡去写。今日发现有两个thrift写挂了,仅单个thrift能支撑写入9G的数据量,服务上线11天来首次挂且hbase服务还在,对此问题进行观测,故对此架构不做调整,写个shell监控服务,并监控重启服务即可。
进程监控
shell脚本:supervisory.sh
#!/bin/sh
while true;
do
time1=$(date)
echo $time1
count=`ps -ef|grep thrift | grep -v grep`
if [ "$?" != "0" ];then
echo ">>>>no thrift,run it"
echo ">>>>restart thrift now !"
hbase-daemon.sh start thrift -p 9090
else
echo ">>>>thrift is runing..."
fi
sleep 60
done
启动脚本并输出到日志:
nohup sh supervisory.sh >> supervisory.log 2>&1 &
查看日志: