Consider the following Python code:
import io
import time
import subprocess
import sys
from thread import start_new_thread

def ping_function(ip):
    filename = 'file.log'
    command = ["ping", ip]
    with io.open(filename, 'wb') as writer, io.open(filename, 'rb', 1) as reader:
        process = subprocess.Popen(command, stdout=writer)
        while process.poll() is None:
            line = reader.read()
            # Do something with line
            sys.stdout.write(line)
            time.sleep(0.5)
        # Read the remaining
        sys.stdout.write(reader.read())

ping_function("google.com")
The goal is to run a shell command (ping in this case, although the specific command is not relevant) and to process its output in real time, while also saving that output to a log file.
In other words, ping runs in the background and produces a line of output every second. My code reads this output every 0.5 seconds, parses it and takes some action in (almost) real time.
Real time here means that I don't want to wait for the process to end before reading its output. In this case ping never completes, so an approach like the one I have just described is mandatory.
I have tested the code above and it actually works OK :)
Now I'd like to run this in a separate thread, so I have replaced the last line with the following:
from thread import start_new_thread
start_new_thread(ping_function, ("google.com", ))
For some reason this no longer works: the string returned by reader.read() is always empty.
Using a Queue or another global variable is not going to help, because I am having problems retrieving the data in the first place (i.e. obtaining the output of the shell command).
My questions are:
How can I explain this behavior?
Is it a good idea to run a process inside a separate thread, or should I use a different approach? This article suggests that it is not...
How can I fix the code?
Thanks!
Solution
You should never fork after starting threads. You can start threads after forking, so you can have a thread handle the I/O piping, but...
Let me repeat this: you should never fork after starting threads.
That article explains it pretty well. You don't have control over the state of your program once you start threads, especially in Python, where things are going on in the background.
To fix your code, just start the subprocess from the main thread, then start threading. It's perfectly OK to process the I/O from the pipes in a thread.
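A minimal sketch of that ordering, under a few assumptions that are not in the question: it targets Python 3 and the threading module rather than the thread module, it reads from a pipe (stdout=subprocess.PIPE) instead of re-reading the log file, and the helper names pump_output and run_and_watch are made up for illustration. The child process is created first, from the main thread, and only then is the reader thread started.

```python
import subprocess
import sys
import threading


def pump_output(process, log_path, on_line):
    """Copy each line of the child's stdout to a log file and to on_line."""
    with open(log_path, "wb") as log:
        # iter(readline, b"") yields lines as they arrive and stops at EOF.
        for line in iter(process.stdout.readline, b""):
            log.write(line)
            log.flush()
            on_line(line)
    process.stdout.close()


def run_and_watch(command, log_path, on_line):
    # Start the child process from the main thread, before any thread exists.
    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    # Only now start the thread that handles the pipe I/O.
    reader = threading.Thread(
        target=pump_output, args=(process, log_path, on_line)
    )
    reader.start()
    return process, reader


if __name__ == "__main__":
    # Stand-in for ["ping", "google.com"]: a short-lived child printing 3 lines.
    demo = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
    process, reader = run_and_watch(
        demo, "file.log", lambda line: sys.stdout.write(line.decode())
    )
    reader.join()   # keep the main thread alive until the reader is done
    process.wait()
```

With a command that never exits, such as ping, the main thread would do its own work instead of calling reader.join() right away. Because threading.Thread is non-daemon by default, the interpreter does not exit while the reader is still running.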