How do you ping 1000 machines within 2 minutes and collect the results?

Latest recommended article published on 2024-04-29 10:17:14

lm236236

Category: python  Tags: python

Article link: https://blog.csdn.net/lm236236/article/details/123714041

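Background: I have recently been building a visual monitoring dashboard, and part of its data comes from real-time ping probes against 1000 physical machines.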

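At first there were only a little over 100 machines, so I took a shortcut and threw together a rough Shell script. It read the IPs of those 100 or so machines from MySQL, pinged them one by one, and wrote the results back to MySQL. By the time the number of monitored machines reached 300, the script could no longer ping all of them and return the results within 2 minutes. On top of that, the load average was frequently above 1.00 (the host is a virtual machine with 1 core and 1 GB of memory). So how should this be optimized? Surely a single ping script should not be enough to saturate the host. To optimize, the first step is to work out the direction: where is there room for improvement?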

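ping is a relatively slow command, especially when the target is unreachable, because network problems can sometimes keep a packet in flight for a long time before it reaches its destination. To keep the probe reliable and reduce false alarms, I usually add the -w 3 option, which sets a 3-second deadline: if no reply arrives within 3 seconds, the host is treated as unreachable. A Shell script runs its commands serially, so a lot of time is wasted waiting, and the total wait grows in proportion to the number of machines.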
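To put rough numbers on the serial wait, here is a worst-case estimate. It assumes every host is unreachable, so each probe uses its full 3-second deadline. The function name and the worker count of 50 are illustrative assumptions, not part of the original Shell script.

```python
import math

def worst_case_seconds(hosts, timeout_s, workers=1):
    """Upper bound on total time if every probe runs to its timeout.

    With `workers` probes in flight at once, the hosts are handled in
    ceil(hosts / workers) rounds, each lasting at most `timeout_s`.
    """
    rounds = math.ceil(hosts / workers)
    return rounds * timeout_s

# Serial Shell loop: one probe at a time
print(worst_case_seconds(300, 3))       # 900 seconds, far over the 2-minute budget
print(worst_case_seconds(1000, 3))      # 3000 seconds
# 50 probes in flight at once (an assumed worker count)
print(worst_case_seconds(1000, 3, 50))  # 60 seconds
```

Even in this pessimistic case, running the probes concurrently brings 1000 hosts back under the 2-minute limit.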

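In addition, a Shell script is a sequence of individual commands. Each external command is started as a new process, and running it involves context switches, during which the CPU is not doing useful work for any process. So the longer the script, the more CPU time is lost to context switching.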
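One way to observe this effect is to count the context switches a process incurs while it starts short-lived children one after another, much like a Shell loop does. The sketch below is only an illustration under assumptions: it relies on the Unix-only resource module, the helper names are made up for this example, and a trivial Python child stands in for ping so that no network access is needed.

```python
import resource
import subprocess
import sys

def context_switches():
    """Return (voluntary, involuntary) context switches of this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw

def measure(func):
    """Run func() and return the context switches this process incurred meanwhile."""
    v0, i0 = context_switches()
    func()
    v1, i1 = context_switches()
    return v1 - v0, i1 - i0

def spawn_many(n=5):
    # Stand-in for a Shell loop: start n short-lived child processes one by one
    # and wait for each to finish before starting the next.
    for _ in range(n):
        subprocess.call([sys.executable, '-c', 'pass'])

voluntary, involuntary = measure(spawn_many)
print('voluntary:', voluntary, 'involuntary:', involuntary)
```

The exact counts depend on the machine and its load, so the numbers are only meaningful when compared between two variants on the same host.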

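With the root causes identified, the direction for optimization is clear. Focus on two points:

1. Reduce the time spent waiting.

2. Reduce the number of context switches.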

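Since starting work I have not had much time to write articles, and this is test code, so the comments are not very complete. If anything is unclear, search Google for the documentation of the relevant module.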

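#!/usr/bin/python3
import queue
import threading
import subprocess
import os

# Read the IP list from the output of a command, one address per line
# (the command itself is masked in the original post)
with os.popen('****') as r:
    ipList = [line.strip() for line in r if line.strip()]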

class pingthread(threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q
    def run(self):
        print("start thread:" + self.name)
        process_data(self.name, self.q)
        print("stop thread:" + self.name)

# queue.Queue does its own locking, so each get hands an item to exactly one
# thread and no separate lock is needed to keep the data consistent.
# Each worker keeps taking IPs until the queue is empty, then returns.
def process_data(threadname, q):
    while True:
        try:
            data = q.get_nowait()
        except queue.Empty:
            break
        # -c 1: send a single packet; -w 3: give up after 3 seconds
        ret = subprocess.call(['ping', '-c', '1', '-w', '3', data], stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
        if ret == 0:
            print('%s: %s is up' % (threadname, data))
        else:
            print('%s: %s is down' % (threadname, data))

threadNum = 50
threads = []
workqueue = queue.Queue()

# Fill the queue before starting the workers, so that an empty queue
# means all the work has been handed out
for ip in ipList:
    workqueue.put(ip)

for i in range(threadNum):
    thread = pingthread(i, 'thread-' + str(i), workqueue)
    thread.start()
    threads.append(thread)

# Wait for every worker to finish its last ping
for thread in threads:
    thread.join()

print('Main thread exiting')
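The script above only prints each result. As a possible variant, not the approach used in the original post, the same idea can be sketched with the standard library's concurrent.futures.ThreadPoolExecutor, returning the results as a dictionary so they can be stored afterwards. The names ping_once and ping_all, the probe parameter and the default of 50 workers are assumptions made for this example, and the ping flags assume the Linux iputils version of ping.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def ping_once(ip, deadline_s=3):
    """Return True if `ip` answers a single echo request within the deadline."""
    ret = subprocess.call(
        ['ping', '-c', '1', '-w', str(deadline_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.STDOUT,
    )
    return ret == 0

def ping_all(ips, probe=ping_once, workers=50):
    """Probe every address concurrently and return {ip: True or False}.

    `probe` can be swapped for another callable, for example in tests.
    """
    ips = list(ips)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs each ip
        # with its own result
        return dict(zip(ips, pool.map(probe, ips)))
```

Calling ping_all(ipList) would return one up/down flag per address, which could then be written back to MySQL in a single batch instead of once per host.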