Program:
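mapper.py
import sys

for line in sys.stdin:
    line = line.strip()
    words = line.split()
    # Skip blank or malformed lines that lack a name and a score.
    if len(words) < 2:
        continue
    # Emit "name<TAB>score" for the reducer.
    print('%s\t%s' % (words[0], words[1]))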
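reducer.py
import sys

current_name = None
count = 0
total = 0

# Input arrives sorted by name, so records for the same name are adjacent.
for line in sys.stdin:
    line = line.strip()
    if '\t' not in line:
        continue
    name, score = line.split('\t', 1)
    try:
        score = int(score)
    except ValueError:
        # Skip records whose score is not an integer.
        continue
    if current_name == name:
        count += 1
        total += score
    else:
        # A new name starts: emit the average for the previous one.
        # '//' keeps the integer average that '/' gives on ints in Python 2.
        if current_name is not None:
            print('%s\t%s' % (current_name, total // count))
        current_name = name
        total = score
        count = 1

# Emit the average for the last name, if any valid input was seen.
if current_name is not None:
    print('%s\t%s' % (current_name, total // count))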
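
To check the logic without a Hadoop cluster, the map, sort, and reduce steps can be imitated in a single process. The following is only a minimal sketch and not part of the original job: the function names `map_lines` and `reduce_pairs` and the sample records are hypothetical, and it assumes each input line has the form `name score` separated by whitespace. Integer division is used to mirror what `/` does on ints in Python 2.

```python
from itertools import groupby
from operator import itemgetter


def map_lines(lines):
    """Mimic mapper.py: emit (name, score) for each well-formed line."""
    for line in lines:
        words = line.split()
        if len(words) >= 2:
            yield words[0], words[1]


def reduce_pairs(pairs):
    """Mimic reducer.py: average the integer scores of each name."""
    # Hadoop Streaming sorts mapper output by key before the reducer runs;
    # sorted() stands in for that shuffle/sort step here.
    ordered = sorted(pairs, key=itemgetter(0))
    for name, group in groupby(ordered, key=itemgetter(0)):
        scores = []
        for _, score in group:
            try:
                scores.append(int(score))
            except ValueError:
                continue  # skip records whose score is not an integer
        if scores:
            yield name, sum(scores) // len(scores)


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["alice 80", "bob 70", "alice 90", "bob x", "carol 65"]
    for name, avg in reduce_pairs(map_lines(sample)):
        print("%s\t%s" % (name, avg))
```

Grouping with `groupby` relies on the same property as the reducer above: records for one name must be adjacent, which the sort guarantees.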
Run process: