Running a Distributed Computing Program in Python for the First Time

 

1.     Running a distributed computing program in Python for the first time

(1)    In a Linux terminal, run the following command:

cat inputFile.txt | python mrMeanMapper.py

(2)    On Windows, enter the following command in a DOS (Command Prompt) window:

python mrMeanMapper.py < inputFile.txt

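The command runs, but I ran into a couple of problems (I have not found a solution yet):

I have to change to the Python installation directory first; otherwise the command fails with: 'python' is not recognized as an internal or external command, operable program or batch file.

The files to be executed and read (the .py and .txt files) also have to be placed in that installation directory.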
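
That error message normally means the command prompt cannot find `python` on the `PATH` environment variable, which would also explain why everything works from inside the installation directory. As a minimal sketch for checking this, the snippet below asks where a command would be resolved from; `find_interpreter` is a hypothetical helper name, not something from the book.

```python
import shutil
import sys


def find_interpreter(name="python"):
    """Return the full path the shell would run for `name`, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    # sys.executable is the interpreter running this script; its folder is
    # the one that would need to be on PATH for a bare `python` to resolve.
    print("running interpreter:", sys.executable)
    print("'python' resolves to:", find_interpreter("python"))
```

If the second line prints `None`, adding the folder shown in the first line to `PATH` should let the scripts and the input file live in any directory.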

 

 

(3)    Run map and reduce together:

Linux: cat inputFile.txt | python mrMeanMapper.py | python mrMeanReducer.py

Windows: python mrMeanMapper.py < inputFile.txt | python mrMeanReducer.py
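
The same pipeline can also be driven from Python, which behaves the same way on Linux and Windows and does not depend on `python` being on `PATH`. This is only a sketch: `run_pipeline` is a hypothetical helper, and it assumes the two scripts and the input file are reachable at the paths given.

```python
import subprocess
import sys


def run_pipeline(input_path, mapper="mrMeanMapper.py", reducer="mrMeanReducer.py"):
    """Feed input_path to the mapper and pipe the mapper's stdout into the reducer."""
    with open(input_path, "rb") as infile:
        map_proc = subprocess.Popen(
            [sys.executable, mapper], stdin=infile, stdout=subprocess.PIPE
        )
        red_proc = subprocess.Popen(
            [sys.executable, reducer], stdin=map_proc.stdout, stdout=subprocess.PIPE
        )
        # Close our copy of the pipe so the reducer sees end-of-file when the mapper exits.
        map_proc.stdout.close()
        output, _ = red_proc.communicate()
        map_proc.wait()
    return output.decode()


if __name__ == "__main__":
    print(run_pipeline("inputFile.txt"))
```

Using `sys.executable` launches both stages with the interpreter that is already running, so the pipeline works from any working directory.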

At run time, mapperOut is the following two-dimensional list:

 

Clearly, the elements of the second list cannot be converted to float, so the original code fails with an error.
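
A likely source of that second list is the mapper's status line. In Python 3, `print(sys.stderr, "report: still alive")` treats `sys.stderr` as just another value to print, so the text goes to standard output and reaches the reducer through the pipe. The sketch below shows the difference; the helper names are made up for illustration.

```python
import contextlib
import io
import sys


def captured_stdout(func):
    """Run func() and return everything it wrote to standard output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func()
    return buffer.getvalue()


def report_as_posted():
    # sys.stderr is printed as a value here, so the whole line lands on stdout.
    print(sys.stderr, "report: still alive")


def report_to_stderr():
    # The file= keyword is the Python 3 way to send print output to stderr.
    print("report: still alive", file=sys.stderr)


if __name__ == "__main__":
    print(repr(captured_stdout(report_as_posted)))   # non-empty: would pollute the pipe
    print(repr(captured_stdout(report_to_stderr)))   # empty: stdout stays clean
```

With the `file=sys.stderr` form in the mapper, the reducer should receive only the tab-separated numbers.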

 

#for instance in mapperOut:
#    nj = float(instance[0])
#    cumN += nj
#    cumVal += nj*float(instance[1])
#    cumSumSq += nj*float(instance[2])

 

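I changed the code as follows. It reads only the first list and calls strip() on each field to remove leading and trailing whitespace; this is a precaution, since float() itself ignores surrounding whitespace.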

 

instance = mapperOut[0]
nj = float(instance[0].strip())
cumN += nj
cumVal += nj*float(instance[1].strip())
cumSumSq += nj*float(instance[2].strip())
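
Reading only `mapperOut[0]` works for a single mapper, but it would ignore the results of any additional mappers. As an alternative sketch (not the book's code), the function below keeps the loop over every line and simply skips lines that do not parse as three numbers. The name `combine` and the skip-on-error policy are my own assumptions.

```python
def combine(mapper_lines):
    """Combine 'count<TAB>mean<TAB>mean_of_squares' lines into overall statistics.

    Lines that do not parse as three numbers (for example a stray status
    message) are skipped instead of raising ValueError.
    """
    cum_n = cum_val = cum_sum_sq = 0.0
    for line in mapper_lines:
        fields = line.split("\t")
        try:
            nj, mean_j, mean_sq_j = (float(field) for field in fields)
        except ValueError:
            continue  # wrong number of fields or non-numeric text
        cum_n += nj
        cum_val += nj * mean_j          # weight each mean by its sample count
        cum_sum_sq += nj * mean_sq_j
    if cum_n == 0:
        raise ValueError("no usable mapper output")
    # Dividing by the total count gives the overall mean and mean of squares.
    return cum_n, cum_val / cum_n, cum_sum_sq / cum_n
```

For example, two mappers that saw the values [1, 2] and [3, 4] would emit `2\t1.5\t2.5` and `2\t3.5\t12.5`, and combining them gives the same mean and mean of squares as processing [1, 2, 3, 4] in one pass.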

 

 

The source code from the book, with my changes, is posted below:

 

#mrMeanMapper.py
import sys
import numpy as np

# read the input line by line
def read_input(file):
    for line in file:
        # rstrip() removes trailing whitespace from the string
        yield line.rstrip()

inputs = read_input(sys.stdin)  # generator over the input lines
inputs = [float(line) for line in inputs]  # overwrite with floats
numInputs = len(inputs)
inputs = np.mat(inputs)
sqInput = np.power(inputs, 2)

# output size, mean, mean(square values)
print("%d\t%f\t%f" % (numInputs, np.mean(inputs), np.mean(sqInput)))
##print(>> sys.stderr, "report: still alive")
# NOTE: in Python 3 this prints both values to stdout, not to stderr
print(sys.stderr, "report: still alive")


#mrMeanReducer.py
import sys
import numpy as np

def read_input(file):
    for line in file:
        yield line.rstrip()

input = read_input(sys.stdin)  # generator over the input lines

# split input lines into separate items and store in list of lists
mapperOut = [line.split('\t') for line in input]

# accumulate total number of samples, overall sum and overall sum sq
cumVal = 0.0
cumSumSq = 0.0
cumN = 0.0
#for instance in mapperOut:
#    nj = float(instance[0])
#    cumN += nj
#    cumVal += nj*float(instance[1])
#    cumSumSq += nj*float(instance[2])

# mapperOut is a two-dimensional list containing two lists;
# the second list cannot be converted to floating-point numbers
# strip() removes leading and trailing whitespace as a precaution
instance = mapperOut[0]
nj = float(instance[0].strip())
cumN += nj
cumVal += nj*float(instance[1].strip())
cumSumSq += nj*float(instance[2].strip())

# calculate means (divide by the total count; the posted version used cumN+1)
mean = cumVal/cumN
meanSq = cumSumSq/cumN

# output size, mean, mean(square values)
print("%d\t%f\t%f" % (cumN, mean, meanSq))
# NOTE: same caveat as in the mapper; this line goes to stdout in Python 3
print(sys.stderr, "report: still alive")
