[Python 3 Advanced] 5. subprocess: child process management, replacing os.popen()/os.system()

zzboat0422

Last modified: 2022-11-25 09:46:21


First published: 2022-08-15 17:33:35

Article link: https://blog.csdn.net/zzboat0422/article/details/118386775


Table of contents

0. References
1. Introduction to subprocess
2. os.system() example
3. os.popen() example
4. The subprocess module
5. Usage scenarios

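0. References

subprocess: Subprocess management (official documentation)
subprocess
Python3 subprocess
os.system (official documentation)
os.popen (official documentation)
A simple and general solution for calling OS commands from Python
python(1): interacting with child processes via subprocess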

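1. Introduction to subprocess

Over the years Python has accumulated many ways to run shell commands. For current Python 3.x (3.5 and later), the officially recommended approach, and the most convenient and complete one, is the built-in subprocess module. Older interfaces such as os.system() (which returns only the exit status) and os.popen() (which returns only the command's output) date from the Python 2.x era, are no longer being developed, and can be left behind.

The subprocess module is intended to replace several older modules and functions, including:

os.popen
os.popen2 (Python 2 only)
os.system
os.exec*
os.spawn*

All examples and quoted documentation in this article are for Python 3.8.x.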

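2. os.system() example

os.system(command) returns the exit status of the process. Anything the command prints goes directly to the interpreter's standard output stream.
Python code that calls os.system() therefore cannot capture the command's output; it only gets the exit status.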

>>> import os
>>> os.system('date')
2022年 1月25日 星期二 10时43分22秒 CST
0
>>> r=os.system('date')
Tue Jan 25 09:59:59 CST 2022  # stdout goes straight to standard output
>>> r
0  # only the exit status is available
>>> r=os.system('data')
sh: data: command not found  # stderr goes straight to standard error
>>> r
32512
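The value 32512 in the session above is not the exit code itself. On Unix, os.system() returns the status in the encoding used by wait(), so the shell's exit code 127 ("command not found") shows up as 127 << 8. A minimal sketch of decoding it, assuming a POSIX system; the helper name `exit_code_from_status` is made up for this illustration:

```python
import os


def exit_code_from_status(status):
    """Decode a POSIX wait status such as the one returned by os.system().

    Returns the exit code for a normal exit, or the negated signal
    number if the process was terminated by a signal.
    """
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    raise ValueError(f"unexpected wait status: {status}")


print(exit_code_from_status(32512))  # 127: the shell could not find the command
print(exit_code_from_status(0))      # 0: success
```

On Python 3.9 and later, os.waitstatus_to_exitcode() performs the same conversion, so a hand-written helper like this is only needed on older versions.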

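3. os.popen() example

os.popen(command) runs the command through a pipe and returns a file-like object; the usual file methods such as read(), readline(), and readlines() return the command's output.
In Python 3.x this function is implemented on top of subprocess.Popen().
Python code that calls os.popen() gets the command's output but not its exit status directly. The returned object does have a close() method, though: it returns None if the child process exited successfully, and the child's return code if it failed.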

>>> os.popen('date')
<os._wrap_close object at 0x102323670>  # what comes back is an object, not the output
>>> r=os.popen('date')
>>> r
<os._wrap_close object at 0x102348430>
>>> print(repr(r.read()))
'Tue Jan 25 11:03:37 CST 2022\n'  # the output is only available through the file methods
>>> r=os.popen('data')
/bin/sh: data: command not found
>>> print(repr(r.read()))
''

# close() example:
>>> p = os.popen("dir c:", 'r')
>>> p.read()
bla bla... <the normal output of dir goes here>
>>> p.close()
>>> p = os.popen("dir d:", 'r') # this machine has no D: drive
>>> p.read()
''
>>> p.close()  # the error message goes straight to standard error; close() returns the exit code
1
>>>
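The two sessions above can be combined into one small helper that returns both the output and whatever close() reports. This is only a sketch: the helper name `popen_output_and_status` is invented here, and the child commands use sys.executable purely so the example does not depend on external tools.

```python
import os
import sys


def popen_output_and_status(command):
    """Run command via os.popen() and return (output, close_status)."""
    pipe = os.popen(command)
    output = pipe.read()    # the command's standard output
    status = pipe.close()   # None on success, otherwise the exit status
    return output, status


# Two tiny child commands: one succeeds, one exits with code 3.
ok_cmd = f'"{sys.executable}" -c "print(42)"'
bad_cmd = f'"{sys.executable}" -c "import sys; sys.exit(3)"'

print(popen_output_and_status(ok_cmd))
print(popen_output_and_status(bad_cmd))
```

Note that on POSIX the non-None value returned by close() is the exit code shifted left by one byte, the same encoding os.system() uses, so it needs decoding before it can be compared with an exit code.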

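4. The subprocess module

With the subprocess module, Python code can obtain both the exit status and the output of a command.
subprocess provides the high-level subprocess.run() function and the lower-level subprocess.Popen() class. subprocess.call() and its relatives (check_call(), check_output()) are the older high-level API that predates Python 3.5; they are not deprecated and still work, but run() is the recommended interface for new code.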

4.1 The `subprocess.run()` function

The full signature is:

subprocess.run(args, *, stdin=None, input=None, stdout=None, stderr=None, capture_output=False, shell=False, cwd=None, timeout=None, check=False, encoding=None, errors=None, text=None, env=None, universal_newlines=None, **other_popen_kwargs)

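What it does: runs the command described by args, waits for it to finish, and returns a CompletedProcess instance.

run() is implemented on top of the Popen interface. Most of its arguments are passed straight through to Popen (the exceptions are timeout, input, check, and capture_output).

Note: run() does not return the command's output directly. It returns a CompletedProcess object, and the output and other information are read from that object's attributes.

args: the command to run. It must be either a string or a sequence of program arguments (for example a list of strings).

stdin, stdout, and stderr: the child's standard input, output, and error. Valid values are subprocess.PIPE, subprocess.DEVNULL, an existing file descriptor, an already open file object, or None. subprocess.PIPE creates a new pipe to the child. subprocess.DEVNULL uses os.devnull. The default, None, performs no redirection, so the child inherits the parent's file handles. stderr can also be merged into stdout by passing stdout=PIPE together with stderr=STDOUT.

capture_output: if True, stdout and stderr are captured. The internal Popen object is then created with stdout=PIPE and stderr=PIPE automatically. The stdout and stderr arguments must not be supplied together with capture_output. This parameter is supported only in Python 3.7 and later.

timeout: the time limit for the command, in seconds. If it expires, the child is killed and a TimeoutExpired exception is raised.

check: if True and the process exits with a non-zero status, a CalledProcessError exception is raised.

encoding: if given, stdin, stdout, and stderr are opened in text mode with this encoding and carry str data. Otherwise they carry bytes.

shell: if True, the command is executed through the operating system's shell.

Important:
shell defaults to False. In that case a command with arguments must be passed to subprocess.run() as a list or tuple (on POSIX, a plain string is treated as the name of the program to run, with no arguments). The Python developers chose this default mainly for security reasons. With shell=True on POSIX, the shell defaults to /bin/sh.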
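The timeout and check parameters described above both report problems through exceptions. A minimal sketch of handling the two together; the helper name `run_checked` and the small Python child commands are invented for this illustration, and sys.executable is used only to avoid depending on external tools:

```python
import subprocess
import sys


def run_checked(args, timeout):
    """Run args and classify the outcome as ok, failed, or timeout."""
    try:
        done = subprocess.run(
            args,
            capture_output=True,  # Python 3.7+
            text=True,
            timeout=timeout,      # seconds; the child is killed when it expires
            check=True,           # raise CalledProcessError on non-zero exit
        )
    except subprocess.TimeoutExpired:
        return ("timeout", None)
    except subprocess.CalledProcessError as exc:
        return ("failed", exc.returncode)
    return ("ok", done.stdout)


print(run_checked([sys.executable, "-c", "print('hi')"], timeout=10))
print(run_checked([sys.executable, "-c", "import sys; sys.exit(3)"], timeout=10))
print(run_checked([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5))
```

Both exception objects also carry the captured stdout and stderr, which is useful when the failure needs to be logged.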

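A typical call uses only a few of these parameters:

subprocess.run(args, *,capture_output=False, shell=False, universal_newlines=None)

To pass a complete command line as a single string, shell=True must be specified explicitly.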

>>> import subprocess as sub
>>> import shlex
>>> cmd='date "+DATE: %Y-%m-%d%nTIME: %H:%M:%S"'
>>> args=shlex.split(cmd)  # split the command line into a list
>>> print(repr(cmd))
'date "+DATE: %Y-%m-%d%nTIME: %H:%M:%S"'
>>> print(repr(args))
['date', '+DATE: %Y-%m-%d%nTIME: %H:%M:%S']
>>> sub.run(cmd,shell=True)  # to run a command string directly, shell=True must be given explicitly
DATE: 2022-03-01
TIME: 11:00:58
CompletedProcess(args='date "+DATE: %Y-%m-%d%nTIME: %H:%M:%S"', returncode=0)
>>> sub.run(args)  # without shell=True, convert the command to a list or tuple first
DATE: 2022-03-01
TIME: 11:01:07
CompletedProcess(args=['date', '+DATE: %Y-%m-%d%nTIME: %H:%M:%S'], returncode=0)
>>> sub.run(args,capture_output=True)  # capture_output captures standard output
CompletedProcess(args=['date', '+DATE: %Y-%m-%d%nTIME: %H:%M:%S'], returncode=0, stdout=b'DATE: 2022-03-01\nTIME: 11:05:05\n', stderr=b'')
>>> sub.run('a',shell=True,capture_output=True)  # capture_output also captures standard error
CompletedProcess(args='a', returncode=127, stdout=b'', stderr=b'/bin/sh: a: command not found\n')
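The security argument for shell=False can be made concrete. In the sketch below, the "untrusted" value is a hypothetical piece of user input containing shell metacharacters; passed as one element of an argument list, it reaches the child as a single argument and is never interpreted by a shell. The child is a tiny Python one-liner that echoes its first argument, chosen only to keep the example self-contained.

```python
import shlex
import subprocess
import sys

# Hypothetical untrusted input, e.g. a file name typed by a user.
untrusted = "notes.txt; echo INJECTED"

# shell=False (the default): the whole string is one argument, nothing is executed from it.
done = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
    check=True,
)
print(repr(done.stdout))

# If a shell command line really is needed on POSIX, quote each untrusted piece first.
quoted = shlex.quote(untrusted)
print(quoted)
```

shlex.quote() targets POSIX shells, so it is not a substitute for argument lists on Windows.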

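4.2 class subprocess.CompletedProcess

The return value of run(), representing a process that has finished. A CompletedProcess has the following attributes:

args: the arguments used to launch the process, usually a list or a string.

returncode: the exit status of the process. 0 indicates success.

stdout: the captured stdout of the child. A bytes sequence by default (a str in text mode), or None if nothing was captured. If run() was called with stderr=subprocess.STDOUT, error messages are combined into stdout and stderr is None.

stderr: the captured stderr of the child. A bytes sequence by default (a str in text mode), or None if nothing was captured.

check_returncode(): checks the return code. Raises CalledProcessError if it is non-zero and returns None if it is zero.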

>>> ret=sub.run('time date',shell=True,capture_output=True,universal_newlines=True)
>>> ret.stdout  # stdout is an attribute, not a method
'2022年 6月28日 星期二 10时39分58秒 CST\n'
>>> ret.stderr  # the timing report from time goes to stderr
'\nreal\t0m0.005s\nuser\t0m0.001s\nsys\t0m0.002s\n'
>>> ret.args
'time date'
>>> ret.returncode
0
>>> ret.check_returncode()
>>>  # check_returncode() is a method with the same effect as check=True in run(); it returns None when the return code is zero
>>> ret=sub.run('a',shell=True,capture_output=True)
>>> ret.stderr
b'/bin/sh: a: command not found\n'
>>> ret.stdout
b''
>>> ret.args
'a'
>>> ret.returncode
127
>>> ret.check_returncode()  # non-zero return code, so `CalledProcessError` is raised
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/subprocess.py", line 444, in check_returncode
    raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command 'a' returned non-zero exit status 127.
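The remark that stderr is None when stderr=subprocess.STDOUT is used can be checked with a short sketch. The child here is a Python one-liner (an assumption made only to keep the example portable) that writes one line to each stream:

```python
import subprocess
import sys

# Child that writes to both streams.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

merged = subprocess.run(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # route stderr into the stdout pipe
    text=True,
)

print(merged.stdout)   # contains both lines
print(merged.stderr)   # None, because stderr was not captured separately
```

The relative order of the two lines inside the merged output depends on how the child buffers each stream, so code that consumes merged output should not rely on it.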

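4.2.1 Encoding of subprocess output

encoding: when given, the output is decoded with that encoding and returned as str.
universal_newlines: when True, the file objects for standard input, standard output, and standard error are opened in text mode, using the given encoding and errors if any (otherwise the locale's default encoding). The text parameter, added in Python 3.7, is an alias for universal_newlines.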

>>> sub.run('date',shell=True)
2022年 3月 2日 星期三 11时15分23秒 CST  # written directly to the terminal, which decodes and displays it
CompletedProcess(args='date', returncode=0)
>>> ret=sub.run('date',shell=True,capture_output=True)
>>> ret  # stdout and stderr are both bytes
CompletedProcess(args='date', returncode=0, stdout=b'2022\xe5\xb9\xb4 3\xe6\x9c\x88 2\xe6\x97\xa5 \xe6\x98\x9f\xe6\x9c\x9f\xe4\xb8\x89 11\xe6\x97\xb615\xe5\x88\x8601\xe7\xa7\x92 CST\n', stderr=b'')
>>> ret.stdout
b'2022\xe5\xb9\xb4 3\xe6\x9c\x88 2\xe6\x97\xa5 \xe6\x98\x9f\xe6\x9c\x9f\xe4\xb8\x89 11\xe6\x97\xb615\xe5\x88\x8601\xe7\xa7\x92 CST\n'  # without decoding, the output is bytes
>>> ret.stdout.decode('utf8')  # decode it to display it properly
'2022年 3月 2日 星期三 11时15分01秒 CST\n'
>>> ret=sub.run('date',shell=True,capture_output=True,encoding='utf8')
>>> ret
CompletedProcess(args='date', returncode=0, stdout='2022年 3月 2日 星期三 14时02分28秒 CST\n', stderr='')
>>> ret.stdout
'2022年 3月 2日 星期三 14时02分28秒 CST\n'
>>> ret=sub.run('date',shell=True,capture_output=True,text=True)  # universal_newlines=True or text=True also opens the streams in text mode
>>> ret
CompletedProcess(args='date', returncode=0, stdout='2022年 3月 2日 星期三 12时32分21秒 CST\n', stderr='')
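The errors parameter mentioned above matters when a command emits bytes that are not valid in the chosen encoding. A small sketch, assuming a child (a Python one-liner invented for this example) that writes the UTF-8 bytes for one Chinese character followed by one invalid byte:

```python
import subprocess
import sys

# The child writes raw bytes: valid UTF-8 for one character, then an invalid 0xff byte.
CHILD = r"import sys; sys.stdout.buffer.write(b'\xe4\xb8\xad\xff')"

raw = subprocess.run([sys.executable, "-c", CHILD], capture_output=True)
print(raw.stdout)            # bytes, exactly as written by the child

decoded = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    encoding="utf-8",
    errors="replace",        # substitute U+FFFD instead of raising UnicodeDecodeError
)
print(repr(decoded.stdout))  # str
```

With the default errors handling (strict), the same call would raise UnicodeDecodeError while reading the output.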

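4.3 The `subprocess.Popen()` class

The full signature is shown below. Note that it comes from a newer release than the 3.8 used elsewhere in this article: group, extra_groups, user, and umask were added in Python 3.9, and pipesize in Python 3.10.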

class subprocess.Popen(args, bufsize=-1, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=True, shell=False, cwd=None, env=None, universal_newlines=None, startupinfo=None, creationflags=0, restore_signals=True, start_new_session=False, pass_fds=(), *, group=None, extra_groups=None, user=None, umask=-1, encoding=None, errors=None, text=None, pipesize=-1)

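Its usage and parameters are largely the same as those of run(), but it returns a Popen object rather than a CompletedProcess, and it does not wait for the command to finish.

One additional parameter worth noting is bufsize. (In the author's tests, values greater than 1 or equal to 0 appeared to have no effect; possible reasons include tty devices being line-buffered by default, or values smaller than io.DEFAULT_BUFFER_SIZE not taking effect.)

bufsize is passed to open() when the stdin/stdout/stderr pipe file objects are created:

0: unbuffered
1: line buffered (only usable when universal_newlines=True, that is, in text mode)
any other positive value: use a buffer of approximately that size
negative: use the system default, io.DEFAULT_BUFFER_SIZE

Because Popen is the lower-level interface that run() calls, a Popen object has some attributes with the same names, though they are not used in quite the same way:
args: the argument sequence or string that was passed to Popen.
stdin: if the stdin argument was PIPE, this attribute is a writable stream object like the one returned by open().
stdout and stderr: if the corresponding argument was PIPE, this attribute is a readable stream object like the one returned by open(). The child's output has to be read from the stream with file methods, so once the stream has been read to its end it cannot be read a second time.

Note:

Prefer communicate() over .stdin.write, .stdout.read, and .stderr.read, to avoid deadlocks caused by one of the OS pipe buffers filling up and blocking the child process.
If encoding or errors is given, or universal_newlines=True, these streams are text streams; otherwise they are byte streams.

pid: the process ID of the child. With shell=True this is the process ID of the spawned shell.
returncode: the child's exit code. It is set only when poll(), wait(), or communicate() is called after the child has terminated. None means the child has not yet been observed to terminate; a negative value -N means the child was terminated by signal N (POSIX only).

Popen() has no capture_output parameter. To capture standard output and standard error, pass stdout=PIPE and stderr=PIPE explicitly: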

>>> ret=sub.Popen('date',shell=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.stdout.read().decode('utf8')  # file methods work here; without text=True the result is bytes and needs decode()
'2022年 3月 2日 星期三 16时56分18秒 CST\n'

Popen instances have the following methods:

Popen.poll(): checks whether the child has terminated. Returns returncode if it has, otherwise None.

>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.poll()  # while the child is still running, poll() returns None
>>> ret.returncode
……
>>> ret.returncode  # returncode stays None until poll() is called again
>>> ret.poll()  # after the child has exited, poll() returns the exit code
0
>>> ret.returncode  # only now is returncode set
0
0
>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
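# read() blocks until end of file, that is, until the child terminates (killed or not) and its end of the pipe is closed.
# If the child keeps writing and fills the OS pipe buffer, it blocks until the parent reads from the pipe.
# As with an ordinary file, read() returns everything up to the end of the stream in one go.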
>>> ret.stdout.read()
'PING www.a.shifen.com (112.80.248.76): 56 data bytes\n64 bytes from 112.80.248.76: icmp_seq=0 ttl=54 time=41.518 ms\n64 bytes from 112.80.248.76: icmp_seq=1 ttl=54 time=82.775 ms\n64 bytes from 112.80.248.76: icmp_seq=2 ttl=54 time=78.468 ms\n64 bytes from 112.80.248.76: icmp_seq=3 ttl=54 time=39.839 ms\n64 bytes from 112.80.248.76: icmp_seq=4 ttl=54 time=74.613 ms\n\n--- www.a.shifen.com ping statistics ---\n5 packets transmitted, 5 packets received, 0.0% packet loss\nround-trip min/avg/max/stddev = 39.839/63.443/82.775/18.773 ms\n'
>>> ret.stdout.read()  # the stream is already at its end, so another read() returns nothing
''
# readline() reads one line at a time; as long as a line is available, it does not block the main process.
>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.stdout.readline()
'PING www.a.shifen.com (112.80.248.75): 56 data bytes\n'
>>> ret.stdout.readline()
'64 bytes from 112.80.248.75: icmp_seq=0 ttl=54 time=93.536 ms\n'
>>> ret.stdout.readline()
'64 bytes from 112.80.248.75: icmp_seq=1 ttl=54 time=85.324 ms\n'
>>> ret.stdout.readline()
'64 bytes from 112.80.248.75: icmp_seq=2 ttl=54 time=80.758 ms\n'
>>> ret.stdout.readline()
'64 bytes from 112.80.248.75: icmp_seq=3 ttl=54 time=74.980 ms\n'
>>> ret.stdout.readline()
'64 bytes from 112.80.248.75: icmp_seq=4 ttl=54 time=70.744 ms\n'
>>> ret.stdout.readline()
'\n'
>>> ret.stdout.readline()
'--- www.a.shifen.com ping statistics ---\n'
>>> ret.stdout.readline()
'5 packets transmitted, 5 packets received, 0.0% packet loss\n'
>>> ret.stdout.readline()
'round-trip min/avg/max/stddev = 70.744/81.068/93.536/7.966 ms\n'
>>> ret.stdout.readline()  # once the end of the stream is reached, no more data can be read
''
>>> ret.stdout.readline()
''
>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.pid
15717
>>> ret.kill()
>>> ret.pid  # pid is available whether or not the child has terminated
15717
>>> ret.returncode
>>> ret.poll()
-9
>>> ret.returncode  # a negative returncode means the child was killed by a signal (here SIGKILL, 9)
-9
>>>
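The interactive session above calls poll() by hand. In a script the same idea usually becomes a loop that does other work between checks. A minimal sketch, assuming a child that simply sleeps for half a second (a stand-in for any long-running command):

```python
import subprocess
import sys
import time

# Stand-in for a long-running command.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])

polls = 0
while proc.poll() is None:   # None means the child is still running
    polls += 1
    time.sleep(0.1)          # the parent is free to do other work here

print(f"exited with {proc.returncode} after {polls} polls")
```

No pipes are involved here, so there is no risk of the child blocking on a full pipe buffer while the parent is busy elsewhere.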

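Popen.wait(timeout=None): waits for the child to terminate and returns returncode. If the child has not terminated after timeout seconds, a TimeoutExpired exception is raised, but the child is not killed; it is safe to catch the exception and wait again. The call blocks the main program until the timeout expires or the child terminates.

Note: with stdout=PIPE or stderr=PIPE, a child that writes enough output to fill the OS pipe buffer blocks until the parent reads from the pipe. If the parent is itself blocked in Popen.wait() at that moment, the two deadlock. Use Popen.communicate() when working with pipes to avoid this.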

>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.wait()  # blocks until the child terminates, then returns returncode; deadlocks if the child's output fills the pipe buffer
0
>>> ret.returncode
0
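The statement that TimeoutExpired from wait() can be caught and the wait repeated looks like this in practice. A sketch under the assumption of a child that sleeps for one second; its output is sent to DEVNULL so that no pipe can fill up:

```python
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(1)"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)

timed_out = False
try:
    proc.wait(timeout=0.2)       # too short: the child needs about one second
except subprocess.TimeoutExpired:
    timed_out = True             # the child is still running at this point
    proc.wait()                  # wait again, this time without a limit

print(timed_out, proc.returncode)
```

Whether to wait again or to terminate the child after a timeout is a policy decision; wait() itself never kills the process.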

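Popen.communicate(input=None, timeout=None):
Like Popen.wait(timeout=None), it waits for the process to terminate and sets returncode. In addition it interacts with the process: it sends data to stdin and reads data from stdout and stderr.
input is optional; to use it, the Popen object must have been created with stdin=PIPE. Likewise, to get anything other than None for stdout and stderr, stdout=PIPE and stderr=PIPE are required.
communicate() returns a tuple (stdout_data, stderr_data). It reads both streams to the end, so a later stdout.read() cannot retrieve the data.
With universal_newlines=True, input and output are strings; otherwise they are bytes.

Note:

input is only useful for commands that read from standard input, for example python3 or chronyc in interactive mode; a command that never reads stdin simply ignores it.
As with wait(), if the child has not terminated after timeout seconds, a TimeoutExpired exception is raised but the child is not killed.

To make sure the child is cleaned up properly when using communicate(), use a pattern like this: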

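proc = subprocess.Popen(...)
try:
    outs, errs = proc.communicate(timeout=15)  # the child is not cleaned up when the timeout expires
except subprocess.TimeoutExpired:  # so the timeout has to be caught explicitly
    proc.kill()  # and the child killed by hand
    outs, errs = proc.communicate()

All of the child's stdout and stderr data is buffered in memory, so do not use this method if the output is very large or unbounded.
In the Python 3.8 session that follows, calling communicate() again after the child has finished still returns the result.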

>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.communicate(timeout=5)  # raises `TimeoutExpired` on timeout, but does not terminate the child
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1024, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1867, in _communicate
    self._check_timeout(endtime, orig_timeout, stdout, stderr)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/subprocess.py", line 1068, in _check_timeout
    raise TimeoutExpired(
subprocess.TimeoutExpired: Command 'ping -i 5 -c 5 www.baidu.com' timed out after 5 seconds
>>> ret.communicate()  # without a timeout, blocks until the child terminates
('PING www.a.shifen.com (112.80.248.75): 56 data bytes\n64 bytes from 112.80.248.75: icmp_seq=0 ttl=54 time=43.759 ms\n64 bytes from 112.80.248.75: icmp_seq=1 ttl=54 time=80.352 ms\nRequest timeout for icmp_seq 2\n64 bytes from 112.80.248.75: icmp_seq=3 ttl=54 time=77.637 ms\n64 bytes from 112.80.248.75: icmp_seq=4 ttl=54 time=40.920 ms\n\n--- www.a.shifen.com ping statistics ---\n5 packets transmitted, 4 packets received, 20.0% packet loss\nround-trip min/avg/max/stddev = 40.920/60.667/80.352/18.380 ms\n', '')
>>> ret.stdout.read()  # after `communicate()` has run, the stream is closed and `read()` can no longer get the data
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.
>>> ret.communicate()  # the data is still held by the Popen object, so `communicate()` returns it again
('PING www.a.shifen.com (112.80.248.75): 56 data bytes\n64 bytes from 112.80.248.75: icmp_seq=0 ttl=54 time=43.759 ms\n64 bytes from 112.80.248.75: icmp_seq=1 ttl=54 time=80.352 ms\nRequest timeout for icmp_seq 2\n64 bytes from 112.80.248.75: icmp_seq=3 ttl=54 time=77.637 ms\n64 bytes from 112.80.248.75: icmp_seq=4 ttl=54 time=40.920 ms\n\n--- www.a.shifen.com ping statistics ---\n5 packets transmitted, 4 packets received, 20.0% packet loss\nround-trip min/avg/max/stddev = 40.920/60.667/80.352/18.380 ms\n', '')
>>> ret=sub.Popen('ping -i 5 -c 5 www.baidu.com',shell=True,universal_newlines=True,stdout=sub.PIPE,stderr=sub.PIPE)
>>> ret.kill()  # killing the child does not prevent reading the data it already produced
>>> ret.stdout.read()
'PING www.a.shifen.com (112.80.248.76): 56 data bytes\n64 bytes from 112.80.248.76: icmp_seq=0 ttl=54 time=46.363 ms\n'
>>> ret.communicate()  # after `read()` has consumed the stream, `communicate()` gets nothing either
('', '')
>>>
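The sessions above never use the input argument of communicate(). A minimal sketch of sending data to a child's stdin, assuming a child (a Python one-liner chosen for portability) that upper-cases whatever it reads:

```python
import subprocess
import sys

# Child that reads all of stdin and writes it back in upper case.
CHILD = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,    # required for the input argument
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,                # str in, str out
)
out, err = proc.communicate("hello\nworld\n", timeout=10)

print(repr(out))
print(proc.returncode)
```

communicate() writes the input, closes the child's stdin, and then reads until both output streams reach end of file, which is why it suits this one-shot style of exchange.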

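By passing one process's stdout as the next process's stdin, several child processes can be chained together like a shell pipeline, with communicate() collecting the final output. As with a shell pipe, this input and output is one-shot: if a program needs to be fed input repeatedly while it runs, communicate() cannot do that.

>>> import subprocess
>>> p1 = subprocess.Popen(['df', '-Th'], stdout=subprocess.PIPE)
>>> p2 = subprocess.Popen(['grep', 'data'], stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
>>> p1.stdout.close()  # lets p1 receive SIGPIPE if p2 exits first
>>> out,err = p2.communicate()
>>> print(out)
/dev/vdb1      ext4      493G  4.8G  463G   2% /data
/dev/vdd1      ext4     1008G  420G  537G  44% /data1
/dev/vde1      ext4      985G  503G  432G  54% /data2
 
>>> print(err)
None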

Popen.send_signal(signal): sends the signal signal to the child. Does nothing if the child has already terminated.

Popen.terminate(): stops the child. On POSIX this sends SIGTERM to the child (equivalent to kill -15 PID). On Windows it calls the Win32 API function TerminateProcess() to stop the child.

Popen.kill(): kills the child. On POSIX this sends SIGKILL to the child (equivalent to kill -9 PID). On Windows kill() is an alias for terminate().
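terminate() and kill() are often combined into a "ask politely, then force" shutdown. The following is only a sketch of that pattern; the helper name `stop_process` and the three-second grace period are assumptions made for illustration, and the child is a Python one-liner that just sleeps.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=3.0):
    """Ask the child to exit; force it if it has not exited in time."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL on POSIX
        return proc.wait()                # reap the child


proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
code = stop_process(proc)
print(code)
```

On POSIX the returned code is negative when the child was ended by a signal, matching the returncode convention described earlier.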

Popen objects can be used as context managers via the with statement; on exit, the file descriptors are closed and the process is waited for:

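from subprocess import Popen, PIPE

# log is assumed to be a file object already opened for writing in binary mode
with Popen(["ifconfig"], stdout=PIPE) as proc:
    log.write(proc.stdout.read())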

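4.4 `wait()` and `communicate()` compared:

| Popen method | Can be called repeatedly | Returns | Blocks the main process | Main risk | Supports stdin |
| --- | --- | --- | --- | --- | --- |
| `wait()` | yes | returncode | yes | deadlock when a pipe fills up | no |
| `communicate()` | yes | (stdout, stderr) | yes | running out of memory | yes |

On Linux the default pipe size is 64 KiB. The following script can be used to test it: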

#!/usr/bin/env python3
import subprocess

def test(size):
    print('start')

    cmd = 'dd if=/dev/urandom bs=1 count=%d 2>/dev/null' % size
    p = subprocess.Popen(args=cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    #p.communicate()
    p.wait()

    print('end')

# 64KB
test(64 * 1024)

# 64KB + 1B
test(64 * 1024 + 1)

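Output (the second call never prints end: the child blocks on the full pipe while the parent blocks in wait()):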

start
end
start

In principle, continuously iterating over the pipe avoids the deadlock that wait() alone would cause, as in this example:

import subprocess

p = subprocess.Popen(["ls","-R"],stdout=subprocess.PIPE)
for line in p.stdout:
    pass  # do something with the line
p.wait()

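However, stdout and stderr are separate pipes, each with its own buffer. If both produce large amounts of data, as in this example:

p = subprocess.Popen(["gcc","-c"]+mega_list_of_files,stdout=subprocess.PIPE,stderr=subprocess.PIPE)

then even with output = p.stdout.read(), nothing is draining stderr; once the stderr pipe fills up, the child blocks and the deadlock occurs anyway.
communicate() services stdout and stderr concurrently (using threads on Windows and a selector loop on POSIX), so it does not deadlock in this situation: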

p = subprocess.Popen(["gcc","-c"]+mega_list_of_files,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
output,error = p.communicate()
return_code = p.wait()
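The gcc example cannot be run without a large list of source files, so here is a self-contained sketch of the same situation. It assumes a child (a short Python script invented for this example) that writes 200 000 characters to each stream, well above the 64 KiB pipe size mentioned earlier:

```python
import subprocess
import sys

SIZE = 200_000  # larger than a typical 64 KiB pipe buffer

# Child that floods stdout first and stderr afterwards.
CHILD = (
    "import sys\n"
    f"sys.stdout.write('o' * {SIZE})\n"
    f"sys.stderr.write('e' * {SIZE})\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = proc.communicate(timeout=30)  # drains both pipes while waiting

print(len(out), len(err), proc.returncode)
```

Replacing communicate() with wait() in this sketch would reproduce the hang shown by the dd test, because nothing would be reading either pipe.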

References:
Python subprocess.Popen: the difference between communicate() and wait() (Chinese-language article)
The difference between Python subprocess. Popen communicate() and wait()
Reproducing deadlock while using Popen.wait()

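4.5 The `subprocess.Popen()` class summarized in diagrams

Reproduced from: "Using subprocess.Popen() in Python".
[Figure: Popen parameters (image not available)]
[Figure: Popen attributes and methods (image not available)]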

5. Usage scenarios

5.1 Run a command and let its output go straight to the terminal

#!/usr/bin/env python3
import subprocess
import shlex

COMMAND='date'
command=shlex.split(COMMAND)
# with the subprocess.run() function
subprocess.run(command)
# with the subprocess.Popen() class
subprocess.Popen(command).wait()  # wait so the child is reaped before the script exits

5.2 Run a command and capture its return code and output

#!/usr/bin/env python3
import subprocess
import shlex

COMMAND = 'date'
command = shlex.split(COMMAND)

# with the subprocess.run() function
ret = subprocess.run(
    command,
    capture_output=True,  # supported only in Python 3.7 and later
    universal_newlines=True,
)
print(
    f'return code is {repr(ret.returncode)}\n'
    f'stdout is {repr(ret.stdout)}\n'
    f'stderr is {repr(ret.stderr)}\n'
)

# with the subprocess.Popen() class
ret = subprocess.Popen(
    command,
    universal_newlines=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# communicate() instead of wait() followed by read(): wait() can deadlock if the output fills a pipe
out, err = ret.communicate()
print(
    f'return code is {repr(ret.returncode)}\n'
    f'stdout is {repr(out)}\n'
    f'stderr is {repr(err)}\n'
)

5.3 Run a command and watch its output in real time (for example, printing ping output as it arrives)

Scenario 1: only stdout needs to be watched, not stderr:

#!/usr/bin/env python3
import subprocess
import shlex

COMMAND = 'ping -c 5 www.baidu.com'
command = shlex.split(COMMAND)
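# real-time output requires the subprocess.Popen() class
ret = subprocess.Popen(
    command,
    universal_newlines=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.DEVNULL,  # stderr is not read here, so discard it rather than leave an unread pipe to fill up
)
for line in ret.stdout:
    print(line,end='')
ret.wait()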

Scenario 2: both stdout and stderr need to be watched:
The example script cmd.sh looks like this:

zou@node1:~$ cat cmd.sh
aaa
ping -c 3 www.baidu.com
bbb

#!/usr/bin/env python3
import subprocess
import shlex

command = './cmd.sh'
ret = subprocess.Popen(
    command,
    shell=True,
    universal_newlines=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
while True:
    line = ret.stdout.readline()
    if not line:
        break
    print(line,end='')

5.4 Run a command or script and return a generator for processing its output in real time

#!/usr/bin/env python3
import subprocess
import shlex

def run_command(command):
    with subprocess.Popen(
            command,
            universal_newlines=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE, ) as proc:
        while True:
            line = proc.stdout.readline()
            if not line:
                proc.wait()  # EOF on stdout does not guarantee the child has exited yet
                if proc.returncode:
                    print(f"{command} running error: {proc.stderr.read()}")
                break
            yield line


def main():
    COMMAND = 'ping -c 5 www.baidu.com'
    command = shlex.split(COMMAND)
    for output in run_command(command):
        print(output, end='')


if __name__ == '__main__':
    main()

5.5 Run an interactive command and feed it input automatically

#!/usr/bin/env python3
import subprocess

command = 'nslookup'
stdin_str = 'www.baidu.com\nexit\n'

print(repr(stdin_str))
ret = subprocess.Popen(
    command,
    shell=True,
    universal_newlines=True,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out, err = ret.communicate(stdin_str)
print(out)

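The code above is equivalent to running nslookup on the command line and then typing:
www.baidu.com
exit

5.6 Run an interactive command and interact with it manually

This very easily leads to deadlock, and there seems to be no foolproof way to do it. See:
Python subprocess与命令行交互 (interacting with the command line via Python subprocess)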
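For the narrow case where the child answers every input line with exactly one output line and flushes it immediately, a line-by-line exchange can work. The sketch below relies on exactly those assumptions; the child script and the helper name `converse` are invented for this illustration, and the approach is not a general solution to the deadlock problem described above.

```python
import subprocess
import sys

# Hypothetical child: echoes each input line in upper case and flushes at once.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def converse(lines):
    """Send lines one at a time and collect one reply line for each."""
    replies = []
    with subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,                      # line buffering in text mode
    ) as proc:
        for line in lines:
            proc.stdin.write(line + "\n")
            proc.stdin.flush()          # make sure the child sees the line now
            replies.append(proc.stdout.readline().rstrip("\n"))
        proc.stdin.close()              # EOF lets the child exit
    return replies


print(converse(["hello", "world"]))
```

If the child buffers its output, prints a prompt without a newline, or replies with a varying number of lines, readline() in the parent can block forever, which is the deadlock risk this section warns about.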

5.7 A complete example

import subprocess, shlex

def os_command(command, print_output=True, shell=False):
    """
    Run an OS command (utility function) and returns a generator
    with each line of the stdout.

    In case of error, the stderr is forwarded through the exception.

    For the arguments, see run_os_command.
    If you are not sure between os_command and run_os_command,
    then the second is likely for you.
    """
    ENCODING = 'UTF-8'
    if isinstance(command, str) and not shell:
        # without a shell, split the string into an argument list
        # (with shell=True the string must reach the shell unsplit):
        command = shlex.split(command)
    Popen = subprocess.Popen
    # Process:
    with Popen(command,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.PIPE,
                    shell=shell) as process:
        while True:
            line = process.stdout.readline()
            if not line:
                # check error:
                process.wait()  # EOF on stdout does not guarantee the child has exited yet
                errno = process.returncode
                if errno:
                    # get the error message:
                    stderr_msg = str(process.stderr.read(), ENCODING)
                    errmsg = "Call of '%s' failed with error %s\n%s" % \
                                            (command, errno, stderr_msg)
                    raise OSError(errno, errmsg)
                break
            line = str(line.rstrip(), ENCODING)
            if print_output:
                print(repr(line))
            yield line

def run_os_command(command, print_output=True, shell=False):
    """
    Execute a command, printing as you go (unless you want to suppress it)

    Arguments:
    ----------
        command: either a string or a list containing each element of the command
            e.g. ['ls', '-l']
        print_output: print the results as the command executes
            (default: True)
        shell: call the shell; this activates globbing, etc.
            (default: False, as this is safer)

    Returns:
    --------
        A string containing the stdout
    """
    r = list(os_command(command, print_output=print_output, shell=shell))
    return "\n".join(r)


def os_get(command, shell=False):
    """
    Execute a command as a function

    Arguments:
    ----------
        command: a list containing each element of the command
            e.g. ['ls', '-l']
        shell: call the shell; this activates globbing, etc.
            (default: False)

    Returns:
    --------
        A string containing the output
    """
    return run_os_command(command, print_output=False, shell=shell)


def main():
    """
    The key is to realize that there are, really, four ways of calling an OS command from a high level language:
    1. as a command: you want to print the output as it comes
    2. as a function: you want no printed output, but a result in the form of a string
    3. as a function with side effect: you want to execute the command, watch what it does, and then analyse the output.
    4. as an ongoing process: you want to get every returned line as soon as it comes and do something with it.
    """
    # Case 1: Command
    run_os_command('ping -c 3 www.baidu.com')
    # Case 2: Function(a string)
    r = os_get(['ls'])
    print(r)
    # Which is really:
    r = run_os_command(['ls'], print_output=False)
    # Case 3: Function with side effect (also printing)
    r = run_os_command(['ls'])
    print(repr(r))
    # Case 4: Get a generator and do something with it
    for line in os_command(['ping', '-c', '5', 'www.baidu.com'], print_output=False):
        print("Look at what just happened:", line)
    # By default, it will print the lines as it goes; if you want to suppress that and do your own print,
    # you have to set the print_output flag to False.



if __name__ == '__main__':
    main()
