A Tutorial on the subprocess Module

Latest recommended article published on 2025-01-02 13:51:56

或许对了

Original article: https://www.pynote.net/archives/490

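Python's subprocess module creates and manages child processes (not threads). It can connect to a child's stdin, stdout, and stderr, retrieve the child's return code once it has finished, and raise an exception when the child times out or fails.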

The subprocess module is intended to replace several older interfaces, including:

# subprocess replacement:
os.system
os.spawn*  # os.spawn* means the spawn family of functions

These older interfaces for creating child processes are best avoided in new code.
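
As a rough illustration of the migration, the sketch below replaces an os.system-style call with run(). The child command (the current Python interpreter printing a line) is only a stand-in chosen so the example runs anywhere; it is not taken from the original article.

```python
import subprocess
import sys

# Old style, shown only for comparison:
#     status = os.system("some-command arg")
#
# Sketch of the replacement: run() takes the command as a list and
# returns a CompletedProcess object that carries the exit status.
result = subprocess.run([sys.executable, "-c", "print('hello from the child')"])
status = result.returncode  # 0 when the child exits successfully
```

Unlike os.system, no shell is involved here, so the arguments are passed to the child exactly as listed.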

Starting with Python 3.5, the subprocess module was consolidated once more, leaving two officially recommended interfaces:

subprocess.run()
subprocess.Popen()

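run() covers most use cases and is built on top of Popen. Popen is more flexible and suits more complex scenarios. (Popen is technically a class, although this article treats it like a function.) Learning these two interfaces well is enough to use almost everything the subprocess module offers.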

Importing the subprocess module

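Because the name subprocess is long, and the module's public function and object names are fairly distinctive, this article imports it as follows. A wildcard import is convenient in an interactive session, though explicit imports are generally preferable in real code.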

>>> from subprocess import *
>>> dir()
['CalledProcessError', 'CompletedProcess', 'DEVNULL', 'PIPE', 'Popen', 'STDOUT', 'SubprocessError', 'TimeoutExpired', '__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'call', 'check_call', 'check_output', 'getoutput', 'getstatusoutput', 'run']

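call, check_call, check_output, getoutput, and getstatusoutput are older interfaces whose use cases run() now covers; they remain only for backward compatibility. The listing above also shows that the module's public interface is fairly small.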

Using subprocess.run()

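As mentioned above, run() was added in Python 3.5 to replace several older interfaces. It executes the command described by args, waits for the command to finish, and returns a CompletedProcess object. Note that run() is synchronous: it blocks until the command completes.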

Signature of run():

subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, 
capture_output=False, shell=False, cwd=None, timeout=None, 
check=False, encoding=None, errors=None, text=None, env=None,
 universal_newlines=None)

args is the command to execute, together with its arguments. run() creates a child process from args and runs it.

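shell controls whether the command is run through a shell (/bin/sh by default on Linux). The default is False, in which case args must be either a command string with no arguments or a list made up of the command and its arguments. With shell=True, args can be an ordinary command-line string. For example: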

>>> run('ls')
acme.sh  domain.key  git-2.9.5.tar.gz  lina         Python-3.7.1      
teapot  test.sh      dataset  git-2.9.5   lamp      private.key  
Python-3.7.1.tgz     test
CompletedProcess(args='ls', returncode=0)
>>> run(['ls','-lh'])
total 28M
drwxrwxr-x.  6 xinlin xinlin  130 Apr  5 16:56 acme.sh
drwxrwxr-x.  5 xinlin xinlin  101 Jan 12 10:37 dataset
-rw-rw-r--.  1 xinlin xinlin 3.2K Apr  7 13:45 domain.key
drwxrwxr-x. 23 xinlin xinlin  20K Dec 29  2018 git-2.9.5
-rw-rw-r--.  1 xinlin xinlin 5.7M Dec 28  2018 git-2.9.5.tar.gz
drwxrwxr-x.  3 xinlin xinlin  108 May 31 22:12 lamp
drwxrwxr-x.  3 xinlin xinlin  159 Feb 23 15:27 lina
-rw-rw-r--.  1 xinlin xinlin 3.2K Apr  7 13:42 private.key
drwxr-xr-x. 19 xinlin xinlin 4.0K Dec 28  2018 Python-3.7.1
-rw-rw-r--.  1 xinlin xinlin  22M Dec 28  2018 Python-3.7.1.tgz
drwxrwxr-x.  5 xinlin xinlin 4.0K May 31 22:33 teapot
drwxrwxr-x.  2 xinlin xinlin   22 Jun 21 20:22 test
-rw-rw-r--.  1 xinlin xinlin   48 Apr  6 15:23 test.sh
CompletedProcess(args=['ls', '-lh'], returncode=0)
>>> run('ls -lh', shell=True)
total 28M
drwxrwxr-x.  6 xinlin xinlin  130 Apr  5 16:56 acme.sh
drwxrwxr-x.  5 xinlin xinlin  101 Jan 12 10:37 dataset
-rw-rw-r--.  1 xinlin xinlin 3.2K Apr  7 13:45 domain.key
drwxrwxr-x. 23 xinlin xinlin  20K Dec 29  2018 git-2.9.5
-rw-rw-r--.  1 xinlin xinlin 5.7M Dec 28  2018 git-2.9.5.tar.gz
drwxrwxr-x.  3 xinlin xinlin  108 May 31 22:12 lamp
drwxrwxr-x.  3 xinlin xinlin  159 Feb 23 15:27 lina
-rw-rw-r--.  1 xinlin xinlin 3.2K Apr  7 13:42 private.key
drwxr-xr-x. 19 xinlin xinlin 4.0K Dec 28  2018 Python-3.7.1
-rw-rw-r--.  1 xinlin xinlin  22M Dec 28  2018 Python-3.7.1.tgz
drwxrwxr-x.  5 xinlin xinlin 4.0K May 31 22:33 teapot
drwxrwxr-x.  2 xinlin xinlin   22 Jun 21 20:22 test
-rw-rw-r--.  1 xinlin xinlin   48 Apr  6 15:23 test.sh
CompletedProcess(args='ls -lh', returncode=0)

The run() calls on lines 1, 6, and 22 of the session above show three different ways of passing args.
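
When a command already exists as a single string, one possible way to get the list form without enabling the shell is shlex.split from the standard library. The sketch below assumes a POSIX system with an echo executable on the PATH; the command string is made up for illustration.

```python
import shlex
import subprocess

cmd = "echo 'hello world' again"

# shlex.split applies shell-like quoting rules, so the quoted phrase
# stays together as one argument.
argv = shlex.split(cmd)  # ['echo', 'hello world', 'again']

# List form: the child is started directly, without a shell.
via_list = subprocess.run(argv, stdout=subprocess.PIPE)

# String form: the same text is handed to the shell for parsing.
via_shell = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE)
```

Both calls produce the same output here. The list form is usually the safer choice when any part of the command comes from user input, since nothing is reinterpreted by a shell.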

Note the CompletedProcess object that run() returns: it holds args and the command's return code. The following session shows how to access its attributes.

>>> proc = run('ls')
Desktop    Downloads	     Music     Public	  test
Documents  examples.desktop  Pictures  Templates  Videos
>>> proc.args
'ls'
>>> proc.returncode
0

A CompletedProcess object may carry more data, as the later examples show.
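
Beyond its attributes, CompletedProcess also offers a check_returncode() method, which raises CalledProcessError when the return code is nonzero. A minimal sketch, using a throwaway Python child that exits with status 3 (the child command is an assumption made for portability):

```python
import subprocess
import sys

# The child simply exits with status 3.
proc = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])

failed_with = None
try:
    # Raises CalledProcessError because proc.returncode is nonzero.
    proc.check_returncode()
except subprocess.CalledProcessError as exc:
    failed_with = exc.returncode
```

This lets the caller inspect the result first and decide afterwards whether a nonzero status should be treated as an error.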

stdin specifies where the command reads its input from.

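stdout specifies where the command's output goes. The default is None, which means the child inherits the parent's stdout, so in the examples above the output is printed directly.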

stderr specifies where the command's error output goes.
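
Besides PIPE, these parameters also accept an open file object or the DEVNULL constant. The sketch below sends stdout to a log file and discards stderr; the file name out.log and the child code are placeholders invented for this example.

```python
import os
import subprocess
import sys
import tempfile

# The child writes one line to stdout and one line to stderr.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "out.log")  # hypothetical log file name
    with open(log_path, "wb") as log:
        proc = subprocess.run(
            [sys.executable, "-c", child_code],
            stdout=log,                 # stdout goes into the file
            stderr=subprocess.DEVNULL,  # stderr is discarded
        )
    with open(log_path, "rb") as log:
        logged = log.read()
```

Because nothing was captured through a pipe, proc.stdout and proc.stderr are both None; the output lives only in the file.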

input is the data to send to the command's stdin. The default is None, meaning no input. input and stdin cannot be used together. Here is an example that uses input:

>>> proc = run('grep fs',shell=True,input=b'adfs\ncccc\nfsfsf')
adfs
fsfsf
>>> proc
CompletedProcess(args='grep fs', returncode=0)
>>> proc = run('grep fs',shell=True,input=b'adfs\ncccc\nfsfsf',stdout=PIPE)
>>> proc
CompletedProcess(args='grep fs', returncode=0, stdout=b'adfs\nfsfsf\n')
>>> proc.stdout
b'adfs\nfsfsf\n'

By default, input must be a bytes object.

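stdout=PIPE redirects stdout to a pipe. With this argument, the result of the grep fs command is no longer printed directly; it is captured and made available as bytes in proc.stdout.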

The next example uses stderr:

>>> proc = run('ls fs',shell=True,stdout=PIPE,stderr=PIPE)
>>> proc.stdout
b''
>>> proc.stderr
b"ls: cannot access 'fs': No such file or directory\n"

Here is an example that combines stdout and input, somewhat like a piped command line in a Linux shell:

>>> proc = run('grep fs',shell=True,input=b'adfs\ncccc\nfsfsf',stdout=PIPE)
>>> run('cat -n',shell=True, input=proc.stdout)
     1	adfs
     2	fsfsf
CompletedProcess(args='cat -n', returncode=0)
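
The two run() calls above execute one after the other, with the intermediate data held in Python. For comparison, here is a sketch of a real pipe between two concurrently running children, built with Popen in the pattern described in the official documentation. It assumes grep is available on the PATH; the producer is a small Python child standing in for any command that writes lines.

```python
import subprocess
import sys

# Producer: prints three lines to its stdout.
producer = subprocess.Popen(
    [sys.executable, "-c", "print('adfs'); print('cccc'); print('fsfsf')"],
    stdout=subprocess.PIPE,
)

# Consumer: reads directly from the producer's stdout.
consumer = subprocess.Popen(
    ["grep", "fs"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
)

# Close the parent's copy of the pipe so the producer can receive
# SIGPIPE if the consumer exits first.
producer.stdout.close()

filtered, _ = consumer.communicate()
producer.wait()
```

With this arrangement the data flows between the two processes without passing through the parent.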

The following example uses stdin, with a file as the source:

>>> f = open('tt.t','r')
>>> proc = run('cat -n', shell=True, stdin=f)
     1	12345
     2	abcde
     3	xyz..
>>> f.close()

A common command-line idiom is to redirect stderr to stdout:

>>> proc = run('ls kk', shell=True, stdout=PIPE, stderr=STDOUT)
>>> proc.stdout
b"ls: cannot access 'kk': No such file or directory\n"

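capture_output (added in Python 3.7), as the name suggests, captures the process's output, both stdout and stderr. capture_output=True has the same effect as stdout=PIPE, stderr=PIPE. When capture_output=True is set, stdout and stderr must not also be specified: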

>>> proc = run('ls kk', shell=True, capture_output=True)
>>> proc
CompletedProcess(args='ls kk', returncode=2, stdout=b'', stderr=b"ls: cannot access 'kk': No such file or directory\n")
>>> proc.stdout
b''
>>> proc.stderr
b"ls: cannot access 'kk': No such file or directory\n"

capture_output=True simply makes the code shorter and easier to write.

cwd sets the working directory in which the child process runs.

>>> proc = run('ls -lh', shell=True, cwd='/usr/local')
total 36K
drwxr-xr-x 2 root root 4.0K Feb  9 16:12 bin
drwxr-xr-x 2 root root 4.0K Feb  9 16:12 etc
drwxr-xr-x 2 root root 4.0K Feb  9 16:12 games
drwxr-xr-x 2 root root 4.0K Feb  9 16:12 include
drwxr-xr-x 3 root root 4.0K Jun 28 21:54 lib
lrwxrwxrwx 1 root root    9 Jun 28 21:32 man -> share/man
drwxr-xr-x 6 root root 4.0K Jun 28 23:34 python-3.7.3
drwxr-xr-x 2 root root 4.0K Feb  9 16:12 sbin
drwxr-xr-x 6 root root 4.0K Feb  9 16:15 share
drwxr-xr-x 2 root root 4.0K Feb  9 16:12 src
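
The env parameter appears in the signature of run() but is not covered in this article. The sketch below shows it next to cwd, under the assumption that the caller wants the child to keep the parent's environment plus one extra variable. The variable name GREETING and the child code are invented for illustration.

```python
import os
import subprocess
import sys
import tempfile

# The child reports its working directory and one environment variable.
child_code = "import os; print(os.getcwd()); print(os.environ['GREETING'])"

with tempfile.TemporaryDirectory() as tmp:
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        cwd=tmp,                                  # child starts in this directory
        env={**os.environ, "GREETING": "hello"},  # parent env plus one hypothetical variable
        capture_output=True,
        text=True,
    )
    expected_cwd = os.path.realpath(tmp)

reported_cwd, greeting = proc.stdout.splitlines()
```

Note that env replaces the child's whole environment rather than extending it, which is why the parent's os.environ is copied in explicitly.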

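text and universal_newlines do the same thing. universal_newlines is kept for backward compatibility (text was introduced in Python 3.7, while 3.5 and 3.6 only have universal_newlines), so text is the one to use. With text=True, stdin, stdout, and stderr work in string mode, so input and captured output are str rather than bytes. Note that all the examples above deal in bytes.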

>>> run('grep fs', shell=True, input=b'asdfs\nfdfs', capture_output=True)
CompletedProcess(args='grep fs', returncode=0, stdout=b'asdfs\nfdfs\n', stderr=b'')
>>> run('grep fs', shell=True, input='asdfs\nfdfs', capture_output=True, text=True) 
CompletedProcess(args='grep fs', returncode=0, stdout='asdfs\nfdfs\n', stderr='')
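
The same contrast can be sketched without relying on grep. The child below is a small Python program, chosen here only for portability, that upper-cases whatever arrives on its stdin.

```python
import subprocess
import sys

# The child echoes its stdin back in upper case.
child = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# Default (bytes) mode: input and captured output are bytes.
as_bytes = subprocess.run(child, input=b"asdfs\nfdfs\n", stdout=subprocess.PIPE)

# Text mode: input and captured output are str.
as_text = subprocess.run(child, input="asdfs\nfdfs\n", stdout=subprocess.PIPE, text=True)
```

In text mode the data is decoded with the locale's default encoding unless the encoding parameter from the signature above is supplied.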

timeout sets a time limit, in seconds, for the child process. If the child has not finished when the limit expires, run() kills it and raises subprocess.TimeoutExpired.

>>> try:
...     run('python3', shell=True, input=b'import time;time.sleep(30)', timeout=1)
... except TimeoutExpired:
...     print('timeout happened...')
... 
timeout happened...

In the code above, the child sleeps for 30 seconds while run() is given a timeout of 1 second. Once subprocess.TimeoutExpired is raised, a short message is printed.

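check: with check=True, run() raises subprocess.CalledProcessError when the child's return code is nonzero. In that case run() does not return a CompletedProcess object at all, so there is no return value from which to read returncode.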

>>> try:
...     proc = run('ls kk', shell=True, check=True, stderr=PIPE)
... except CalledProcessError:
...     print(proc.returncode)
... 
0

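The code above ends up in the except branch because the kk directory does not exist, yet the printed returncode is 0. The reason is that run() raised an exception instead of returning, so proc was never reassigned and still refers to the CompletedProcess from an earlier, successful call. The actual exit status is available as the returncode attribute of the CalledProcessError exception.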
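
One way to get at the real exit status is to bind the exception and read its attributes. The sketch below uses a Python child that writes to stderr and exits with status 2; the child is a stand-in for the failing ls command, chosen so the example does not depend on a particular ls implementation.

```python
import subprocess
import sys

# The child reports an error on stderr and exits with status 2.
child_code = "import sys; sys.stderr.write('boom'); sys.exit(2)"

code = None
message = None
try:
    subprocess.run([sys.executable, "-c", child_code], check=True, stderr=subprocess.PIPE)
except subprocess.CalledProcessError as exc:
    code = exc.returncode  # the child's actual exit status
    message = exc.stderr   # captured because stderr=PIPE was given
```

Reading from the exception avoids any dependence on a variable left over from a previous call.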

Using subprocess.Popen()

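run() is built on Popen. run() is synchronous: it waits until the child finishes or the timeout expires. Popen, by contrast, creates the child asynchronously and does not wait; use poll() to check whether the child has finished. Overall, Popen is more flexible than run(), so consider it when run() cannot meet your needs.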

Signature of Popen():

Popen(args, bufsize=-1, executable=None, stdin=None, stdout=None, stderr=None, 
preexec_fn=None, close_fds=True, shell=False, cwd=None, env=None, 
universal_newlines=None, startupinfo=None, creationflags=0, 
restore_signals=True, start_new_session=False, pass_fds=(), *,
 encoding=None, errors=None, text=None)

The parameters args, stdin, stdout, stderr, shell, cwd, universal_newlines, and text have the same meaning and usage as in run().

Basic usage of Popen:

>>> proc = Popen('ls -hl', shell=True, stdout=PIPE, stderr=STDOUT)
>>> out, _ = proc.communicate()
>>> print(out.decode())
total 37M
-rw-r--r--  1 xinlin xinlin  535 Jun 29 06:03 apache_log_reader.py
-rw-r--r--  1 xinlin xinlin 3.2M Jun 30 02:55 py.maixj.sql
-rw-r--r--  1 xinlin xinlin 3.2M Jun 29 19:20 py.online.sql
drwxr-xr-x 19 xinlin xinlin 4.0K Jun 28 23:24 Python-3.7.3
-rw-r--r--  1 xinlin xinlin  22M Mar 25 13:59 Python-3.7.3.tgz
-rw-r--r--  1 xinlin xinlin   27 Jul  5 01:05 sleep.py
-rw-r--r--  1 xinlin xinlin   18 Jul  5 00:10 tt.t
-rw-r--r--  1 xinlin xinlin  800 Jun 29 03:26 walktree.py
-rw-r--r--  1 xinlin xinlin 8.2M Jun 29 05:47 www.access_log_2019_06_28
>>> proc.returncode
0
>>> proc.pid
2985

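Popen creates a child process asynchronously and returns a Popen object. The communicate() method retrieves stdout and stderr and returns them as a tuple. Because the example above sets stderr=STDOUT, error output is merged into stdout and the second element of the tuple is None, hence the _ placeholder.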

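The communicate() method of a Popen object takes two parameters, input and timeout, which set the data sent to the child and the time limit. communicate() waits for the child to finish; when timeout is given, it raises TimeoutExpired if the child is still running once the limit expires.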

>>> proc = Popen('grep fs', shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
>>> out, err = proc.communicate(b'adfs\nfsmnjkl')
>>> out
b'adfs\nfsmnjkl\n'
>>> err
b''

Here is an example with a timeout:

>>> proc = Popen('python3', shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
>>> try:
...     out,err=proc.communicate(b'import time;time.sleep(30)', 1)
... except TimeoutExpired: 
...     print('time out...')
... 
time out...
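
Unlike run(), communicate() does not kill the child when the timeout expires, so the child in the example above keeps running. The official documentation suggests killing it and calling communicate() again to collect any remaining output. A minimal sketch of that pattern, with a Python child standing in for the long-running command:

```python
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
try:
    out, err = proc.communicate(timeout=1)
except subprocess.TimeoutExpired:
    proc.kill()                    # the child is still running; stop it
    out, err = proc.communicate()  # collect remaining output and reap the child
```

After the second communicate() call, proc.returncode is set and the pipes are closed.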

A Popen object also has a wait() method, which likewise accepts a timeout while waiting for the child to finish:

>>> try:
...     proc=Popen('python3 -c "import time;time.sleep(30)"',shell=True,stdout=PIPE)
...     returncode = proc.wait(15)
... except TimeoutExpired:
...     print('after waiting 15 seconds, timeout finally...')
... 
after waiting 15 seconds, timeout finally...

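Note the assignment to returncode: if the timeout occurs, wait() raises before the assignment happens, so the name returncode is never defined. The return code can also be read from proc.returncode, which is None while the child is still running, as it is after this timeout.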

Often we know the child will eventually finish but not how long it will take. In that case, use poll() to check whether it has finished:

>>> def test_Popen():
...     import time
...     proc=Popen('python3 -c "import time;time.sleep(10)"',shell=True,stdout=PIPE)
...     i = 0
...     while True:
...         returncode = proc.poll()
...         if returncode is None:
...             time.sleep(2)
...             i += 2
...             print('sleep',i,'seconds')
...             continue
...         else:
...             print('sub process is terminated with returncode',returncode)
...             break
... 
>>> test_Popen()
sleep 2 seconds
sleep 4 seconds
sleep 6 seconds
sleep 8 seconds
sleep 10 seconds
sleep 12 seconds
sub process is terminated with returncode 0

Popen objects also have the following methods; example code for them is left for a future occasion:

Popen.send_signal(signal)
Popen.terminate()
Popen.kill()
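
As a rough sketch of how these methods are typically combined: ask the child to stop with terminate(), give it a grace period, and fall back to kill() if it is still running. The 5-second grace period and the sleeping child are assumptions made for this example.

```python
import subprocess
import sys

proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

# Politely ask the child to stop (SIGTERM on POSIX).
proc.terminate()
try:
    proc.wait(timeout=5)  # grace period, chosen arbitrarily here
except subprocess.TimeoutExpired:
    # Still running after the grace period: force it (SIGKILL on POSIX).
    proc.kill()
    proc.wait()

exit_status = proc.returncode
```

On POSIX, a child ended by a signal gets a negative return code equal to minus the signal number. send_signal() works the same way for any other signal taken from the signal module.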

For more on the subprocess module, see the official documentation: https://docs.python.org/3/library/subprocess.html

-- EOF --