Your preprocess.py reads sys.argv[1] and tries to open it as a file.
If you pass -h on the command line, it therefore tries to open a file with that name.
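To make the failure mode concrete, here is a minimal sketch. The `naive_open` helper is hypothetical: it only assumes that preprocess.py does something like `open(sys.argv[1])`, since the original script is not shown here.

```python
def naive_open(argv):
    """Mimic a script that treats its first argument as a file name."""
    # No option parsing happens: whatever comes first is taken as a path.
    return open(argv[1])


try:
    naive_open(["preprocess.py", "-h"])
except OSError as exc:
    # Raised unless a file literally named "-h" exists in the current directory.
    print("could not open:", exc)
```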
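Split command line parsing from processing
Your preprocessing function should not care about command line arguments; it should take an already opened file object as an argument.
So, after parsing the command line arguments, it is your job to provide that file object, which in your case will be sys.stdin.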
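As a rough illustration of that split, the sketch below keeps the processing step free of any command line knowledge. Both `count_lines` and `open_input` are hypothetical names invented for this example; `count_lines` merely stands in for whatever the real preprocessing does.

```python
import io
import sys


def count_lines(stream):
    """Hypothetical stand-in for the real preprocessing step.

    It only needs an iterable of lines, so it never looks at sys.argv.
    """
    return sum(1 for _ in stream)


def open_input(fname):
    """Return sys.stdin for "-", otherwise open the named file for reading."""
    return sys.stdin if fname == "-" else open(fname)


# Any file-like object works, which makes the function easy to try out:
print(count_lines(io.StringIO("first\nsecond\n")))  # prints 2
```

Because the function accepts any file-like object, it can be exercised with an in-memory `io.StringIO` just as easily as with sys.stdin or a real file.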
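Sample solution using docopt
There is nothing wrong with argparse, but my favourite parser is docopt, so I will use it to illustrate the typical split into three steps: parsing the command line, preparing the final function call, and making the final function call. You can achieve the same with argparse.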
First install docopt:
$ pip install docopt
Here is the code of fromstdin.py:
"""fromstdin - Training and Testing Framework
Usage: fromstdin.py [options] <input>

Options:
    --text=<textmodel>         Text model [default: text.txt]
    --features=<features>      Features model [default: features.txt]
    --test=<testset>           Testing set [default: testset.txt]
    --vectorizer=<vectorizer>  The vectorizer [default: vector.txt]

Read data from <input> file. Use "-" for reading from stdin.
"""
import sys


def main(fname, text, features, test, vectorizer):
    # "-" means: read from standard input instead of a named file
    if fname == "-":
        f = sys.stdin
    else:
        f = open(fname)
    process(f, text, features, test, vectorizer)
    print("main func done")


def process(f, text, features, test, vectorizer):
    # Works on any open file object; knows nothing about the command line
    print("processing")
    print("input parameters", text, features, test, vectorizer)
    print("reading input stream")
    for line in f:
        print(line.strip("\n"))
    print("processing done")


if __name__ == "__main__":
    from docopt import docopt
    # Step 1: parse the command line
    args = docopt(__doc__)
    print(args)
    # Step 2: prepare the arguments for the final call
    infile = args["<input>"]
    textfile = args["--text"]
    featuresfile = args["--features"]
    testfile = args["--test"]
    vectorizer = args["--vectorizer"]
    # Step 3: make the final call
    main(infile, textfile, featuresfile, testfile, vectorizer)
It can be called like this:
$ python fromstdin.py
Usage: fromstdin.py [options] <input>
Show the help:
$ python fromstdin.py -h
fromstdin - Training and Testing Framework
Usage: fromstdin.py [options] <input>

Options:
    --text=<textmodel>         Text model [default: text.txt]
    --features=<features>      Features model [default: features.txt]
    --test=<testset>           Testing set [default: testset.txt]
    --vectorizer=<vectorizer>  The vectorizer [default: vector.txt]

Read data from <input> file. Use "-" for reading from stdin.
Use it, feeding data from stdin:
(so)javl@zen:~/sandbox/so/cmd$ ls | python fromstdin.py -
{'--features': 'features.txt',
 '--test': 'testset.txt',
 '--text': 'text.txt',
 '--vectorizer': 'vector.txt',
 '<input>': '-'}
processing
input parameters text.txt features.txt testset.txt vector.txt
reading input stream
bcmd.py
callit.py
fromstdin.py
scrmodule.py
processing done
main func done
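For comparison, roughly the same command line could be described with argparse. This is only a sketch: the `build_parser` helper and the positional name `input` are assumptions made for the example, and the option names and defaults are copied from the docopt usage text above.

```python
import argparse


def build_parser():
    """Build an argparse parser mirroring the docopt usage text."""
    parser = argparse.ArgumentParser(
        prog="fromstdin.py",
        description="Training and Testing Framework",
    )
    parser.add_argument("--text", default="text.txt", help="Text model")
    parser.add_argument("--features", default="features.txt", help="Features model")
    parser.add_argument("--test", default="testset.txt", help="Testing set")
    parser.add_argument("--vectorizer", default="vector.txt", help="The vectorizer")
    # A lone "-" is accepted as a positional value, so it can signal stdin.
    parser.add_argument("input", help='Input file, or "-" to read from stdin')
    return parser


# Passing an explicit list instead of relying on sys.argv keeps this easy to try.
args = build_parser().parse_args(["--text", "model.txt", "-"])
print(args.input, args.text, args.features)
```

In a real script you would call `build_parser().parse_args()` without arguments so that sys.argv is used, then pass `args.input` and the option values to the same `main` function as in the docopt version.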