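Design approach:
The customer's web service keeps going down and the web UI returns 502. The Java process behind the web service hangs, or its own CPU or memory usage spikes, so the process still exists but the service is effectively unavailable.
Run curl against the actual web domain behind the load balancer and extract the key field, http_code.
Handle two cases depending on whether http_code is 200:
http_code = 200: print that the service is running normally.
http_code != 200: force-kill the hung Java process and bring the service process back up.
Also configure a scheduled task that runs the script every 5 minutes.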
#!/bin/bash
# This script checks the jax web application and repairs it immediately if it is down.
#
time=$(date "+%Y-%m-%d %H:%M:%S")
echo "---------------------------------------------------------------------------"
echo "----- start check jax system on ${time} ----"
# Probe the load-balanced domain and keep only the HTTP status code (curl prints 000 if the request fails).
http_code=`curl -I -m 10 -o /dev/null -s -w %{http_code} noobing.jax.com`
#
if [ "$http_code" -eq 200 ]
then
    echo "The jax system is running normally."
else
    echo "The jax system is running abnormally."
    # Force-kill the hung java process; -r (GNU xargs) skips kill when no PID matches.
    ps -ef | grep java | grep eureka | awk '{print $2}' | xargs -r kill -9
    # Remove the old container and start a fresh one.
    docker stop noobing-gateway
    docker rm noobing-gateway
    docker run --net=host -v /log:/var/log/app/noobing:rw -d -ti --name=noobing-gateway -p 9999:9999 swr.cn-south-1.myhuaweicloud.com/jax-docker/noobing-gateway:v1.1
fi
# Take a fresh timestamp so the end line shows when the check actually finished.
echo "--- end check jax system on $(date "+%Y-%m-%d %H:%M:%S") -----"
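The PID selection in the kill step can be checked without touching real processes. The following is a minimal sketch, assuming a made-up `ps -ef` snapshot: the function names `sample_ps` and `select_pids`, the PIDs and the jar names are all hypothetical and only illustrate how the `grep`/`awk` filter behaves.

```shell
#!/bin/bash
# Hypothetical `ps -ef` snapshot; columns are UID PID PPID C STIME TTY TIME CMD.
sample_ps() {
    cat <<'EOF'
root      1201     1  0 09:00 ?        00:01:10 java -jar eureka-server.jar
root      1345     1  2 09:01 ?        00:12:45 java -jar other-service.jar
root      1502  1400  0 09:30 pts/0    00:00:00 grep eureka
EOF
}

# Same filter as the script: keep lines mentioning both java and eureka,
# then print the second column (the PID).
select_pids() {
    grep java | grep eureka | awk '{print $2}'
}

sample_ps | select_pids
```

In this snapshot only the first line contains both `java` and `eureka`, so only its PID is selected. The `grep eureka` helper line is dropped because it does not contain `java`.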
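#Notes:
http_code=`curl -I -m 10 -o /dev/null -s -w %{http_code} noobing.jax.com`
Probes the given domain and returns only its HTTP status code
-I fetch the HTTP headers only (HEAD request)
-m 10 allow at most 10 seconds for the whole request
-o /dev/null discard the normal output
-s silent mode (no progress meter or error messages)
-w %{http_code} print the HTTP status code after the transfer completes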
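The probe can also be wrapped in a small function so that the caller always receives a three-digit code. This is only a sketch under assumptions: the function name `probe_status` is hypothetical, and the fallback to `000` mirrors what curl itself prints for `%{http_code}` when no HTTP response was received.

```shell
#!/bin/bash
# probe_status URL
# Prints the HTTP status code for URL, or 000 if no valid code was obtained.
probe_status() {
    local url=$1 code
    # Same curl options as in the notes above; "|| true" keeps a curl failure
    # from aborting a caller that runs with "set -e".
    code=$(curl -I -m 10 -o /dev/null -s -w '%{http_code}' "$url") || true
    case $code in
        [0-9][0-9][0-9]) printf '%s\n' "$code" ;;   # looks like a status code
        *)               printf '000\n' ;;          # empty or garbled output
    esac
}
```

A call such as `http_code=$(probe_status noobing.jax.com)` would then replace the inline curl command.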
###############
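if [ "$http_code" -eq 200 ] checks whether the returned http_code equals 200; if it does, the script prints that the jax system is running normally.
else: run the kill and restart commands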
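One detail of the comparison: `-eq` is a numeric test, so if http_code were ever empty or non-numeric, `[` would print "integer expression expected" and return a non-zero status, which still lands in the else branch. A string comparison gives the same decision without the error message. The sketch below assumes a hypothetical helper named `check_action` that is not part of the original script.

```shell
#!/bin/bash
# check_action CODE
# Prints "ok" when CODE is exactly 200, otherwise "repair".
check_action() {
    # String comparison: empty or non-numeric input simply counts as unhealthy.
    if [ "$1" = "200" ]; then
        echo "ok"
    else
        echo "repair"
    fi
}

check_action 200
check_action 502
check_action ""
```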
##########################################
Scheduled task (crontab) configuration
*/5 * * * * /usr/local/autorepair.sh >> /usr/local/autorepair.log 2>&1
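The order of `2>&1` and `>>` in such an entry matters, because the shell applies redirections from left to right. The following sketch shows the difference with a hypothetical `emit` function standing in for the repair script; the temporary log files are created with `mktemp` purely for the demonstration.

```shell
#!/bin/bash
# Hypothetical stand-in for the repair script: one line per stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

log_wrong=$(mktemp)
log_right=$(mktemp)

# "2>&1 >> file": stderr is duplicated onto the current stdout first,
# and only afterwards is stdout redirected to the file.
leaked=$(emit 2>&1 >> "$log_wrong")

# ">> file 2>&1": stdout is redirected to the file first, then stderr follows it.
kept=$(emit >> "$log_right" 2>&1)

echo "2>&1 >> file   escaped the log: '${leaked}'"
echo ">> file 2>&1   escaped the log: '${kept}'"
```

With `2>&1` written first, error messages from `docker` or `kill` bypass the log file and go wherever cron sends the job's output. With `2>&1` written last, both streams end up in the log.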