Using Python's subprocess module

hqzxsc2006

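The purpose of subprocess is to start a new process and communicate with it.

The subprocess module is built around a single class: Popen. You use Popen to create a process and interact with it in complex ways. Its constructor (shown here with the Python 2 signature; Python 3 adds further parameters) is:

subprocess.Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0)

The args parameter can be a string or a sequence (such as a list or tuple) and specifies the executable and its arguments. If it is a sequence, the first element is normally the path of the executable. The executable parameter can also be used to specify the path of the executable explicitly.

The stdin, stdout and stderr parameters specify the program's standard input, standard output and standard error handles. Each can be PIPE, a file descriptor or a file object, or None, which means the handle is inherited from the parent process.

If shell is set to True, the program is executed through the shell.

The env parameter is a dictionary that specifies the child's environment variables. If env is None, the child inherits the parent's environment.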
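As an illustration of these parameters, here is a minimal sketch. It assumes Python 3 and uses the current interpreter (sys.executable) as a stand-in child program; the variable name DEMO_GREETING is made up for the example.

```python
import os
import subprocess
import sys

# Child program: print one environment variable, then its working directory.
child_code = "import os; print(os.environ.get('DEMO_GREETING')); print(os.getcwd())"

# Copy the parent's environment and add one (hypothetical) variable.
child_env = dict(os.environ, DEMO_GREETING="hello")

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],  # args as a sequence: program first, then arguments
    stdout=subprocess.PIPE,              # capture the child's standard output
    cwd=os.getcwd(),                     # working directory for the child
    env=child_env,                       # environment for the child (None would inherit)
    universal_newlines=True,             # decode output to str instead of bytes
)
out, _ = proc.communicate()
lines = out.splitlines()
print(lines[0])
```

Passing args as a list and leaving shell at its default avoids any shell quoting issues; the child sees exactly the arguments given.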

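subprocess.PIPE

    When creating a Popen object, subprocess.PIPE can be passed as the stdin, stdout or stderr argument. It requests a pipe between the parent and the corresponding standard stream of the child.

subprocess.STDOUT

    When creating a Popen object, this can be passed as the stderr argument. It sends the child's error output to the same handle as its standard output.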

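Popen methods:

Popen.poll()

    Checks whether the child process has finished. Sets and returns the returncode attribute, which is None while the child is still running.

Popen.wait()

    Waits for the child process to finish. Sets and returns the returncode attribute. It blocks the caller until the child exits, and it can deadlock when stdout or stderr is a PIPE that nobody reads, so use it with care.

Popen.communicate(input=None)

    Interacts with the child process: sends data to stdin and reads data from stdout and stderr. The optional input argument is the data to send to the child. communicate() returns a tuple (stdoutdata, stderrdata). Note: to send data to the child through its stdin, the Popen object must be created with stdin set to PIPE. Likewise, to get data from stdout and stderr, both must be set to PIPE.

Popen.send_signal(signal)

    Sends a signal to the child process.

Popen.terminate()

    Stops the child process. On POSIX this sends SIGTERM; on Windows it calls the Windows API function TerminateProcess() to end the child.

Popen.kill()

    Kills the child process. On POSIX this sends SIGKILL; on Windows it is an alias for terminate().

Popen.stdin, Popen.stdout, Popen.stderr

    If the corresponding constructor argument was PIPE, the attribute is a file object connected to that stream of the child; otherwise it is None. About the constructor arguments, the official documentation says:

stdin, stdout and stderr specify the executed programs’ standard input, standard output and standard error file handles, respectively. Valid values are PIPE, an existing file descriptor (a positive integer), an existing file object, and None.

Popen.pid

    The process ID of the child.

Popen.returncode

    The return code of the process. None if the process has not finished yet.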
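The following sketch shows how these methods and attributes fit together. It assumes Python 3 and uses a sleeping Python child as a stand-in for a long-running program.

```python
import subprocess
import sys

# Start a child that would sleep for 30 seconds if left alone.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

print("pid:", proc.pid)
still_running = proc.poll() is None   # poll() returns None while the child is alive

proc.terminate()                      # SIGTERM on POSIX, TerminateProcess() on Windows
exit_code = proc.wait()               # reap the child and collect its exit status

print("running before terminate:", still_running)
print("returncode:", proc.returncode)
```

On POSIX a child ended by a signal reports a negative returncode (the negated signal number); on Windows the value is the exit code passed to TerminateProcess().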

---------------------------------------------------------------

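Simple usage:

import subprocess

p = subprocess.Popen("dir", shell=True)
p.wait()

Whether to set shell depends on the command you want to run. On Windows, dir is a built-in of the command interpreter rather than a standalone executable, so shell=True is required here. p.wait() returns the command's return code.

If you write a = p.wait() instead, a is the returncode. Printing a will typically show 0, which means the command succeeded.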

---------------------------------------------------------------------------

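Communicating with the process

To capture the output of a process, a pipe is a convenient method:

import subprocess

p = subprocess.Popen("dir", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(stdoutput, erroutput) = p.communicate()

p.communicate() waits until the process exits and returns its standard output and standard error, which is how you get at the child's output.

communicate can also send data to the child's stdin: pass the data as the input argument and receive the result of the command in the returned tuple.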
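A minimal sketch of that pattern, sending data to stdin through communicate and unpacking the returned tuple. It assumes Python 3; the upper-casing child is a made-up stand-in for a real command.

```python
import subprocess
import sys

# Child: read all of stdin and write it back upper-cased.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

p = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,        # required, otherwise input cannot be sent
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,      # exchange str instead of bytes
)
# communicate() writes the input, closes stdin, and reads until the child exits.
stdoutput, erroutput = p.communicate(input="say hi\n")
print(repr(stdoutput))
```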

------------------------------------------------------------------------

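Above, standard output and standard error are kept separate. They can also be merged: just set the stderr argument to subprocess.STDOUT, like this:

p = subprocess.Popen("dir", shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
(stdoutput, erroutput) = p.communicate()  # erroutput is None: stderr went into stdoutput

If you want to process the child's output line by line, that is no problem either:

p = subprocess.Popen("dir", shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while True:
    buff = p.stdout.readline()
    # An empty read means end-of-file; stop once the child has also exited.
    if not buff and p.poll() is not None:
        break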

------------------------------------------------------

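Deadlock

Be careful if you use a pipe but never consume its output: if the child writes too much data, a deadlock occurs. For example:

p = subprocess.Popen("longprint", shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
p.wait()

longprint is a hypothetical program that produces a large amount of output. On my Windows XP and Python 2.5 setup, the deadlock occurred once the output reached 4096 bytes: the pipe buffer is full, the child blocks on its next write, and the parent blocks in wait(). The buffer size varies by platform. If we drain the output with p.stdout.readline or p.communicate, no deadlock occurs no matter how much is written. Avoiding the pipe altogether, either by not redirecting or by redirecting to a file, also avoids the deadlock.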
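A sketch of the safe variant, assuming Python 3. Since longprint is hypothetical, a Python child that writes about 1 MB stands in for it.

```python
import subprocess
import sys

# Stand-in for "longprint": a child that writes 1,000,000 characters to stdout.
longprint = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 1000000)"]

p = subprocess.Popen(longprint, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
# communicate() keeps reading while it waits, so the pipe never fills up.
output, _ = p.communicate()
print(len(output), p.returncode)
```

Replacing communicate() with a bare p.wait() here is the pattern that can hang once the output exceeds the pipe buffer.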

----------------------------------

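subprocess can also chain several commands together.

In the shell, commands are chained with pipes.

In subprocess, the output of one command can be used as the input of the next. A second process p2 takes the stdout of the first process p1 as its input and then runs, for example, the tail command.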
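A minimal sketch of such a chain, assuming Python 3. To stay portable, both stages are small Python children: the first stands in for any command that prints lines, the second mimics `tail -n 2`.

```python
import subprocess
import sys

# p1 prints the numbers 1..10, one per line (stand-in for the first command).
p1 = subprocess.Popen(
    [sys.executable, "-c", "print('\\n'.join(str(i) for i in range(1, 11)))"],
    stdout=subprocess.PIPE,
)
# p2 keeps only the last two lines of its input (stand-in for `tail -n 2`).
p2 = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.writelines(sys.stdin.readlines()[-2:])"],
    stdin=p1.stdout,              # p1's output becomes p2's input
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
p1.stdout.close()                 # the parent no longer needs its copy of the pipe
output = p2.communicate()[0]
p1.wait()
print(output)
```

Closing p1.stdout in the parent matters: otherwise p1 would never notice if p2 exited early, because the parent would still hold the read end of the pipe open.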

--------------------

Below is a larger example. It pings a list of IP addresses and reports whether the host at each address is alive. The code is adapted from the book Python for Unix and Linux System Administration.

#!/usr/bin/env python3

from threading import Thread
import subprocess
from queue import Queue

num_threads = 3
ips = ['127.0.0.1', '116.56.148.187']
q = Queue()

def pingme(i, queue):
    while True:
        ip = queue.get()
        print('Thread %s pinging %s' % (i, ip))
        # -c 1 sends a single packet (Unix ping); the output is discarded
        ret = subprocess.call(['ping', '-c', '1', ip],
                              stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)
        if ret == 0:
            print('%s is alive!' % ip)
        else:
            print('%s is down...' % ip)
        queue.task_done()

# start num_threads worker threads
for i in range(num_threads):
    t = Thread(target=pingme, args=(i, q))
    t.daemon = True  # do not keep the program alive once the main thread ends
    t.start()

for ip in ips:
    q.put(ip)
print('main thread waiting...')
q.join()
print('Done')

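The main benefit of the code above is that running the ping commands in several threads saves a lot of time.

With a single thread, each ping has to wait for the previous one to finish. For 100 addresses, the total time is 100 × the average time per ping.

With multiple threads the pings overlap, and the total running time is roughly that of the busiest thread, far less than in the single-threaded case.

The Queue module deserves some study here.

The pingme function works as follows:

Each started thread runs pingme.

pingme takes an element from the queue, blocking until one is available, and runs the ping command on it.

The queue is shared by several threads, which is why a list is not used here. With a list we would have to handle synchronization ourselves; Queue already does its own locking internally, which saves us that work.

The join call on q blocks the current thread. The English documentation says:

    Queue.join()

    Blocks until all items in the queue have been gotten and processed (task_done()).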
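The interplay of get(), task_done() and join() can be sketched without any network access. This is a simplified illustration assuming Python 3 (where the module is named queue); doubling a number stands in for the ping call.

```python
import threading
from queue import Queue

q = Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = q.get()              # blocks until an item is available
        with results_lock:          # the plain list is shared, so guard it ourselves
            results.append(item * 2)
        q.task_done()               # tell the queue this item is finished

for _ in range(3):
    threading.Thread(target=worker, daemon=True).start()

for n in range(10):
    q.put(n)

q.join()                            # returns once every item has been marked done
print(sorted(results))
```

Note that the queue needs no extra locking, while the shared results list does, which is the point made above about lists.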

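In the Python standard library, subprocess.Popen is the preferred way to shell out to an external process. On Linux/Unix it is implemented by forking a child process and then calling exec to load the external program.

This raises a problem. If we need a fixture-like child process (for example the server under test that is started for web integration tests), we have to clean up when leaving the context, that is, stop the child process we started.

The simplest, crudest approach looks like this:

process_fixture.py

import contextlib
import subprocess
import sys
import time
import urllib.request


@contextlib.contextmanager
def process_fixture(shell_args):
    proc = subprocess.Popen(shell_args)
    try:
        yield proc  # hand the Popen object to the with block
    finally:
        # clean up whether or not an exception occurred
        proc.terminate()
        proc.wait()


if __name__ == '__main__':
    # http.server is the Python 3 name of SimpleHTTPServer
    with process_fixture([sys.executable, '-m', 'http.server', '8080']) as proc:
        print('pid %d' % proc.pid)
        time.sleep(1)  # crude wait for the server to start listening
        print(urllib.request.urlopen('http://localhost:8080').read())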

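The proc.wait() call must not be skipped. Otherwise, if the child is terminated while the parent keeps running, the child remains a zombie holding on to its pid, and is only handed over to init and reaped once the parent terminates too.

This crude version may be enough for simple cases, but the program being run may not be so well behaved. It may fork child processes of its own to do the work and act only as a supervisor, which is what many web servers do. Simply calling terminate on such a program, that is, sending SIGTERM to its pid, kills the supervisor, and the real worker processes are orphaned and adopted by init, becoming unintended daemons. That is exactly the problem I first ran into: on the CI server the web server process had clearly been terminated, yet the port was still in use when the next test run started.

This problem is a little tricky, because once the program forks, the resulting children have their own address space and pid, which puts them out of our direct reach. Fortunately subprocess.Popen has a preexec_fn parameter. It accepts a callback and runs it in the gap after fork and before exec. We can use this to modify the child before the program starts, for example by calling setsid() to set up an independent process group.

A Linux process group is a set of processes. Any process that is not already a group leader can call setsid to create a new session containing a new process group, and become its leader. The leader's descendants all belong to this group as long as they do not call setsid (or setpgid) themselves to move out. The group continues to exist as long as one live process remains in it, even if the leader has died. What makes this useful is that, knowing the leader's pid (which is also the pgid of the group), we can send a signal to the whole group and every process in it receives it.

So with preexec_fn we can make the process started by Popen form its own process group, and then send SIGTERM or SIGKILL to the group to stop all descendants of that process. The precondition, of course, is that none of the descendants calls setsid to split off on its own.

The earlier example, modified, looks like this: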

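better_process_fixture.py

import signal
import os
import contextlib
import subprocess
import warnings


@contextlib.contextmanager
def process_fixture(shell_args):
    # os.setsid runs in the child after fork and before exec (POSIX only)
    proc = subprocess.Popen(shell_args, preexec_fn=os.setsid)
    try:
        yield proc
    finally:
        proc.terminate()
        proc.wait()

        try:
            # the pgid equals proc.pid because the child is the group leader
            os.killpg(proc.pid, signal.SIGTERM)
        except OSError as e:
            # raised, for example, when no process is left in the group
            warnings.warn(str(e))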

Since Python 3.2, subprocess.Popen has an option start_new_session. Popen(args, start_new_session=True) is equivalent to preexec_fn=os.setsid.
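A small sketch of the start_new_session variant, assuming Python 3.2+ on a POSIX system (os.killpg and os.getpgid are not available on Windows). A sleeping Python child stands in for the server process.

```python
import os
import signal
import subprocess
import sys

# The child becomes the leader of a new session and a new process group.
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    start_new_session=True,          # the child calls setsid() before exec
)
pgid = os.getpgid(proc.pid)          # equals proc.pid: the child leads its own group
os.killpg(pgid, signal.SIGTERM)      # signal every process still in that group
proc.wait()                          # reap the child so it does not stay a zombie
print(pgid == proc.pid, proc.returncode)
```

Here the group is signalled while the leader is still alive, so one killpg call reaches the leader and any descendants that stayed in the group.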

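Cleaning up a child's descendants through the process group is "cleaner" than simply terminating the child itself. Honcho, a Procfile process manager implemented in Python, uses the same method. Since there is no guarantee that the children of the program being run never call setsid, this method cannot be called universal, only reasonably usable. For a fully general solution, tracking process creation with cgroups, as systemd does, may be the only way. No wonder systemd is said to be the first init tool that can shut down services correctly.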

---------------------------------------------

Worked examples

2.1 Using the subprocess module

1. subprocess.call

>>> subprocess.call(["ls", "-l"])
0
>>> subprocess.call("exit 1", shell=True)
1

2. Run a cmd command on the system (Windows) and show its result:

x=subprocess.check_output(["echo", "Hello World!"],shell=True)

print(x)
"Hello World!"

3. Show the contents of a file from Python:

y=subprocess.check_output(["type", "app2.cpp"],shell=True)

print(y)
#include
using namespace std;
......

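4. Run ipconfig -all and save its output to the file tmp.log:

with open(r'd:\tmp.log', 'wt') as handle:
    # wait() keeps the file open until the child has finished writing
    subprocess.Popen(['ipconfig', '-all'], stdout=handle).wait()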

5. Run ipconfig -all and keep the output in a variable:

output = subprocess.Popen(['ipconfig', '-all'], stdout=subprocess.PIPE, shell=True)
oc = output.communicate()  # read everything the child wrote

# communicate() returns a tuple (stdoutdata, stderrdata).
print(oc[0])  # print the network information

Windows IP Configuration

Host Name . . . . .

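6. If you need to talk to the child process repeatedly, you cannot use communicate(),

because communicate() closes the pipes after a single exchange. In that case you can try the following approach:
# "wc" is only a placeholder: readline() blocks unless the child answers line by line
p = subprocess.Popen(["wc"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True)
p.stdin.write(b'your command')
p.stdin.flush()
# ......do something

try:
    # ......do something
    p.stdout.readline()
    # ......do something
except IOError:
    print('IOError')

# ......do something more

p.stdin.write(b'your other command')
p.stdin.flush()
# ......do something more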

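2.2 Interacting with a child process through pipes

Python offers several ways to interact with shell scripts and other programs, for example:
os.system(cmd): only runs a shell command. It cannot feed the command any input, and it returns only the exit status, not the output.
os.popen(cmd): allows interaction, but only one-shot. Every call creates and destroys a process, so performance is poor.

1. A simple example that calls ls; there is no interaction between the two processes:

import subprocess
p = subprocess.Popen('ls')

2. An example that captures the output in the program:

import subprocess
p = subprocess.Popen('ls', stdout=subprocess.PIPE)
print(p.stdout.readlines())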

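3. An example with both input and output.

The parent sends 'say hi', the child prints test say hi, and the parent reads that output and prints it.
#test1.py
import sys
line = sys.stdin.readline()
print('test', line)

#run.py
import sys
from subprocess import Popen, PIPE
p = Popen([sys.executable, 'test1.py'], stdin=PIPE, stdout=PIPE, universal_newlines=True)
p.stdin.write('say hi\n')
p.stdin.flush()
print(p.stdout.readline())

#result
test say hi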

4. An example with continuous input and output

# test.py
import sys
while True:
    line = sys.stdin.readline()
    if not line:
        break
    sys.stdout.write(line)
    sys.stdout.flush()

# run.py
import sys
from subprocess import Popen, PIPE
proc = Popen([sys.executable, 'test.py'], stdin=PIPE, stdout=PIPE, universal_newlines=True)
for line in sys.stdin:
    proc.stdin.write(line)
    proc.stdin.flush()
    output = proc.stdout.readline()
    sys.stdout.write(output)
Note the flush calls in both run.py and test.py: remember to flush the buffers, otherwise the programs do not receive the expected input and output.
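The same exchange can be tried in a single file. This is a sketch assuming Python 3; the echo child is passed inline with -c instead of living in a separate test.py, and the two messages are made up.

```python
import subprocess
import sys

# Child: echo each input line back immediately, flushing after every line.
echo_code = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", echo_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)

replies = []
for message in ["first\n", "second\n"]:
    proc.stdin.write(message)
    proc.stdin.flush()                      # push the line through the pipe now
    replies.append(proc.stdout.readline())  # wait for the child's answer

proc.stdin.close()                          # end-of-file lets the child's loop end
proc.wait()
print(replies)
```

Removing either flush is an easy way to see the problem described above: the parent then blocks in readline() because the data is still sitting in a buffer.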

2.3 Reading a child's output in real time in Python

1. Method one

import subprocess

# -n 10 is the Windows ping option for sending 10 packets
p = subprocess.Popen('ping 127.0.0.1 -n 10', stdout=subprocess.PIPE)
while True:
    line = p.stdout.readline()
    if not line:
        break
    print(line)

2. Method two:

import subprocess

p = subprocess.Popen('ping 127.0.0.1 -n 10', stdout=subprocess.PIPE)
while p.poll() is None:
    print(p.stdout.readline())
# the child has exited: drain whatever is left, then report the exit status
print(p.stdout.read())
print('return code:', p.returncode)

3. Method three:

import subprocess

p = subprocess.Popen('ping 127.0.0.1 -n 10', stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, b""):
    print(line)
