Let's build an fpm monitoring task that evolves step by step, as a way to get familiar with writing Python scripts. Python is chosen for its excellent extensibility: if we need Elasticsearch support, for example, a quick pip install is all it takes. Once the script is complete, we will be able to observe fpm process status at sub-minute intervals, such as the total process count and the number of active processes.
First, let's establish how fpm can be monitored:
1. fpm exposes a fixed URL that returns process information
2. nginx proxies this URL
3. the monitoring process fetches the data through nginx
4. the data is aggregated into Elasticsearch, where we can chart how process counts change over time
For the php-fpm configuration, just remove the default ";" in front of pm.status_path (you can also specify your own URL), then restart fpm for the change to take effect:
; see php-fpm.d/www.conf
pm.status_path = /status
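For example, on a systemd-managed host the restart might look like the line below; the unit name is an assumption here, since it varies by distribution and PHP version:
systemctl restart php-fpm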
nginx configuration:
server {
    listen 80;
    server_name fpm9000;
    root /tmp;
    index index.php index.html index.htm;

    location /status {
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_split_path_info ^(.+\.php)(/.*)$;
        fastcgi_param PATH_INFO $fastcgi_path_info;
        include fastcgi.conf;
    }
}
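After editing, it is worth validating the configuration and reloading nginx with its standard commands:
nginx -t && nginx -s reload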
With these two steps done, we can verify on the server running nginx:
curl -H "host:fpm9000" localhost/status
pool: www
process manager: dynamic
start time: 30/Mar/2019:15:27:36 +0800
start since: 493473
accepted conn: 59477
listen queue: 0
max listen queue: 129
listen queue len: 128
idle processes: 353
active processes: 1
total processes: 354
max active processes: 349
max children reached: 0
slow requests: 0
In addition, fpm's status page supports JSON output, which we can consume directly in the monitoring program:
curl -H "host:fpm9000" "localhost/status?json"
{"pool":"www","process manager":"dynamic","start time":1553930856,"start since":493566,"accepted conn":59479,"listen queue":0,"max listen queue":129,"listen queue len":128,"idle processes":353,"active processes":1,"total processes":354,"max active processes":349,"max children reached":0,"slow requests":0}
Now let's start writing the monitoring script in Python. It covers the following topics:
1. fetching data with curl
2. JSON processing and indexing into ES
3. looping over multiple tasks and accepting script arguments
4. multithreading
1 Fetching data with curl (history/v1.py)
#!/usr/bin/python
import pycurl
import StringIO

def check_and_save(server):
    # server is [server name, ip, virtual host name]
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://"+server[1]+"/status?json")
    curl.setopt(pycurl.HTTPHEADER, ["host:"+server[2]])
    # collect the response body in a StringIO buffer
    result = StringIO.StringIO()
    curl.setopt(pycurl.WRITEFUNCTION, result.write)
    curl.perform()
    body = result.getvalue()
    print(server)
    print(body)

server = ["lhq01", "172.17.83.146", "fpm9000"]
check_and_save(server)
Program walkthrough: import the required modules
Define a standalone function check_and_save, which initializes curl
Set the request URL
Direct the output to a StringIO buffer
Perform the curl request and read back the returned result
Call check_and_save to run the check
The parameter structure is [server name, ip, virtual host name].
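A side note: the scripts in this series target Python 2 (hence the StringIO module). A minimal Python 3 sketch of the same check, assuming the identical [server name, ip, virtual host name] structure, would use io.BytesIO and decode the response:
#!/usr/bin/python3
import io
import pycurl

def check_and_save(server):
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://" + server[1] + "/status?json")
    curl.setopt(pycurl.HTTPHEADER, ["host:" + server[2]])
    result = io.BytesIO()                     # pycurl writes bytes under Python 3
    curl.setopt(pycurl.WRITEFUNCTION, result.write)
    curl.perform()
    body = result.getvalue().decode("utf-8")  # decode bytes into a str
    print(server)
    print(body)

check_and_save(["lhq01", "172.17.83.146", "fpm9000"])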
Next, we need to store the result in ES, and we also want to record the server name inside the result, which requires JSON parsing support.
2 JSON processing and indexing into ES (history/v2.py)
#!/usr/bin/python
from datetime import datetime
import pycurl
import StringIO, json
from elasticsearch import Elasticsearch

def check_and_save(server, es):
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://"+server[1]+"/status?json")
    curl.setopt(pycurl.HTTPHEADER, ["host:"+server[2]])
    result = StringIO.StringIO()
    curl.setopt(pycurl.WRITEFUNCTION, result.write)
    curl.perform()
    body = result.getvalue()
    print(server)
    print(body)
    # parse the JSON body, then enrich it with extra fields before indexing
    data = json.loads(body)
    data["@timestamp"] = datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%f+0800")
    data["host"] = server[0]
    es.index(index="fpm", doc_type="monitor", body=data)

es = Elasticsearch("172.17.83.146:9200")
server = ["lhq01", "172.17.83.146", "fpm9000"]
check_and_save(server, es)
This time we parse the JSON string into an object, add new fields, and store it in ES. After running the script, finding the corresponding record in ES confirms the script is working.
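One quick way to look is Elasticsearch's search API; assuming the same host and the fpm index used above, the most recent document can be pulled back with:
curl "172.17.83.146:9200/fpm/_search?size=1&sort=@timestamp:desc&pretty"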
Next, consider that each run captures only a single sample. With a scheduled task we can run once per minute; if within a single invocation we then perform several checks, we capture multiple samples per minute and the data becomes more fine-grained.
So far we have built a basically usable monitoring script: running it captures fpm monitoring data, and the run records show up in ES.
3 Handling multiple tasks and passing in arguments (history/v3.py)
#!/usr/bin/python
from datetime import datetime
import socket, pycurl
import StringIO, json
import time, sys
from elasticsearch import Elasticsearch

def check_and_save(server, es):
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://"+server[1]+"/status?json")
    curl.setopt(pycurl.HTTPHEADER, ["host:"+server[2]])
    result = StringIO.StringIO()
    curl.setopt(pycurl.WRITEFUNCTION, result.write)
    curl.perform()
    body = result.getvalue()
    print(server)
    print(body)
    data = json.loads(body)
    data["@timestamp"] = datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%f+0800")
    data["host"] = server[0]
    es.index(index="fpm", doc_type="monitor", body=data)

# renamed from "list" so we do not shadow the built-in type
server_list = [
    ["mgc01", "172.17.83.146", "fpm9000"],
    ["mgc01", "172.17.83.146", "fpm9001"]
]
es = Elasticsearch("172.17.83.146:9200")
count = int(sys.argv[1]) if len(sys.argv) > 1 else 1
sleep = int(sys.argv[2]) if len(sys.argv) > 2 else 10
for i in range(0, count):
    for server in server_list:
        check_and_save(server, es)
    time.sleep(sleep)
Invoked from the command line, it looks like this:
/path/to/me.py count sleep
This lets the program run N times, resting 10 seconds between rounds. But if we have many services to monitor, the sequential checks block one another, so the one-minute schedule would not run as expected; we can therefore have Python perform the checks in threads.
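Tying this back to the scheduled-task idea: a crontab entry along these lines (the script path here is hypothetical) fires once a minute, and the arguments make each invocation take six samples ten seconds apart, roughly filling the minute:
* * * * * /path/to/v3.py 6 10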
4 Multithreading (history/v4.py)
This time we let the program use threads to handle the data-fetching tasks. Python has a simple way to do this:
thread.start_new_thread(function, args[, kwargs])
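As a minimal sketch of this low-level API (Python 2's thread module; the worker function is made up for illustration):
import thread, time

def worker(vhost):
    print("checked " + vhost)

thread.start_new_thread(worker, ("fpm9000",))
time.sleep(1)  # without this pause the main program may exit before the thread runs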
But when the main program exits, its threads end with it, and you have to track thread completion yourself. Instead, we can use threading.Thread to manage the multithreaded tasks:
#!/usr/bin/python
from datetime import datetime
import socket, pycurl
import StringIO, json
import time, sys, threading
from elasticsearch import Elasticsearch

class MyThread(threading.Thread):
    def __init__(self, server):
        threading.Thread.__init__(self)
        self.server = server

    def run(self):
        print("start thread", self.server)
        check_and_save(self.server)
        #threadLock.acquire()
        # do something with shared data
        #threadLock.release()

def check_and_save(server):
    print("start check_and_save", server, datetime.now())
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://"+server[1]+"/status?json")
    curl.setopt(pycurl.HTTPHEADER, ["host:"+server[2]])
    result = StringIO.StringIO()
    curl.setopt(pycurl.WRITEFUNCTION, result.write)
    curl.perform()
    body = result.getvalue()
    print(body)
    data = json.loads(body)
    data["@timestamp"] = datetime.now().strftime("%Y-%m-%dT%H:%M:%S.%f+0800")
    data["host"] = server[0]
    data["tag"] = "thread"  # mark documents collected by the threaded version
    es.index(index="fpm", doc_type="monitor", body=data)

server_list = [
    ["mgc01", "172.17.83.146", "fpm9000"],
    ["mgc01", "172.17.83.146", "fpm9001"]
]
es = Elasticsearch("172.17.83.146:9200")
threadLock = threading.Lock()
threads = []
length = int(sys.argv[1]) if len(sys.argv) > 1 else 1
sleep = int(sys.argv[2]) if len(sys.argv) > 2 else 10
for i in range(0, length):
    for server in server_list:
        try:
            thread = MyThread(server)
            thread.start()
            threads.append(thread)
        except:
            print("error starting new thread")
    # block until every worker thread has finished
    for t in threads:
        t.join()
    print("tasks all done")
    if i < length-1:  # no point sleeping after the final round
        time.sleep(sleep)
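As with v3, the round count and sleep interval come from the command line; for example, six rounds ten seconds apart covers roughly a minute of samples (the path is again hypothetical):
/path/to/v4.py 6 10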
Key parts of the code:
thread = MyThread(server) initializes the thread
thread.start() starts the thread
threads.append(thread) adds it to the thread management list
for t in threads:
    t.join()
These two lines block the main thread, guaranteeing that all worker threads have completed before execution moves on. With this, we have implemented php-fpm process status monitoring, with collection that can be refined to multiple samples per minute.
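One loose end: the script defines threadLock but only hints at it in commented-out code. If the threads ever update shared state, say a hypothetical counter of failed checks, the lock would wrap the update like this:
threadLock.acquire()
failed_checks += 1  # hypothetical shared counter guarded by the lock
threadLock.release()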
Each completed topic has a corresponding file in git; for details, see the project source on GitHub: https://github.com/hqlulu/pyMonitor