I wrote a small Python script to parse and extract information from a PDF. I tested it on my local machine, which has Python 2.6.2 and pdftotext version 0.12.4.
I am now trying to run it on my web hosting server (DreamHost), which has Python 2.5.2 and pdftotext version 3.02.
When I run the script there, I get the following error at the pdftotext line (I have confirmed it with a simple throwaway script as well): "Error: Couldn't open file '-'"
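    import os
    import subprocess

    def ConvertPDFToText(currentPDF):
        pdfData = currentPDF.read()
        # Copy the PDF into an anonymous temp file and rewind it
        tf = os.tmpfile()
        tf.write(pdfData)
        tf.seek(0)
        if (len(pdfData) > 0) :
            # First "-" = read the PDF from stdin, second "-" = write the text to stdout
            out, err = subprocess.Popen(["pdftotext", "-layout", "-", "-"], stdin = tf, stdout=subprocess.PIPE ).communicate()
            return out
        else :
            return None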
Note that I am passing this function the same PDF file on both machines, and the script does have access to it: in another function of the same script, running on the web host, I can email myself the PDF document.
What am I doing wrong? What could differ in the usage of subprocess, Python, or pdftotext between my local version and the web host version? I am guessing I will have to modify the command, so any help would be greatly appreciated.
Thanks in advance.
Solution
Can the pdftotext on the web host read from standard input when you run it directly on the command line? Can you verify this? Also, why not pass the name of the temporary file as an argument rather than feeding the file on standard input? (Reposting here as per your suggestion.)
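A minimal sketch of the second suggestion, passing a real file name instead of "-" for the input. It assumes the older pdftotext on the host still accepts "-" as the output argument (text to stdout) and only rejects it as the input, which is what the error message suggests but which I have not verified on that build. The names build_command and convert_pdf_to_text are made up for illustration, and the code is written for a current Python rather than 2.5 (for instance, the delete argument of NamedTemporaryFile is not available that far back).

```python
import os
import subprocess
import tempfile


def build_command(pdf_path):
    # "-layout" and the trailing "-" (write text to stdout) are taken from the
    # question; only the input argument changes from "-" to a real path.
    return ["pdftotext", "-layout", pdf_path, "-"]


def convert_pdf_to_text(pdf_data):
    """Return the bytes pdftotext writes to stdout, or None for empty input."""
    if not pdf_data:
        return None
    # delete=False so the file can be closed before pdftotext opens it by name.
    tmp = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
    try:
        tmp.write(pdf_data)
        tmp.close()
        proc = subprocess.Popen(build_command(tmp.name),
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        out, err = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError("pdftotext failed: %r" % err)
        return out
    finally:
        tmp.close()          # harmless if already closed
        os.remove(tmp.name)  # clean up the named temp file ourselves
```

To check the first question, running pdftotext by hand on the host with a PDF redirected to its standard input (the same "-layout - -" arguments as in the question) should reproduce the "Couldn't open file '-'" error if that build cannot read from stdin.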