I am trying to run a Python script on a Hadoop cluster using Hadoop Streaming for sentiment analysis.
The same script runs properly on my local machine and produces output.
To run it on the local machine, I use this command:
$ cat /home/MB/analytics/Data/input/* | ./new_mapper.py
To run it on the Hadoop cluster, I use the command below:
$ hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.5.0-mr1-cdh5.2.0.jar -mapper "python $PWD/new_mapper.py" -reducer "$PWD/new_reducer.py" -input /user/hduser/Test_04012015_Data/input/* -output /user/hduser/python-mr/out-mr-out
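For context, the local command above exercises only the mapper, while the streaming job also sorts the mapper output by key and feeds it to the reducer. A minimal in-process sketch of that map, sort, reduce flow is shown here. The functions `map_line`, `reduce_group`, and `run_local` are hypothetical names made up for illustration, and the "first CSV field is the key, count occurrences" logic is an assumption, not the logic of new_mapper.py or new_reducer.py.

```python
from itertools import groupby


def map_line(line):
    # Hypothetical mapper: treat the first comma-separated field as the key
    # and emit a count of 1 for it (assumption, not the real script's logic).
    fields = line.rstrip("\n").split(",")
    if fields and fields[0]:
        yield fields[0], 1


def reduce_group(key, values):
    # Hypothetical reducer: sum the counts seen for one key.
    return key, sum(values)


def run_local(lines):
    # Map phase: collect every (key, value) pair the mapper emits.
    pairs = [kv for line in lines for kv in map_line(line)]
    # Sorting by key stands in for the shuffle/sort step between map and reduce.
    pairs.sort(key=lambda kv: kv[0])
    # Reduce phase: each reducer call sees all values for one key.
    return [
        reduce_group(key, [value for _, value in group])
        for key, group in groupby(pairs, key=lambda kv: kv[0])
    ]


if __name__ == "__main__":
    sample = ["pos,good movie\n", "neg,bad movie\n", "pos,great movie\n"]
    for key, total in run_local(sample):
        print("%s\t%d" % (key, total))
```

Running a sketch like this on a few sample lines can help separate problems in the script logic from problems in how the job is submitted to the cluster.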
The sample code of my script is:
#!/usr/bin/env python
import sys
def main(argv):
    ## for line in sys.stdin:
    ##     print line
    for line in sys.stdin:
        line = line.split(',')
        t_