Parsing User-Agent (UA) strings by calling a Python script from hadoop-streaming

1. Locate the hadoop-streaming-2.3.0-mr1-cdh5.1.2.jar package in the environment

[root@ayu python]$ cd $HADOOP_HOME && find ./ -name "*streaming*"
./share/doc/hadoop-streaming
./share/doc/hadoop-mapreduce1/streaming.pdf
./share/doc/hadoop-mapreduce1/streaming.html
./share/doc/hadoop-mapreduce1/api/org/apache/hadoop/streaming
./share/doc/api/org/apache/hadoop/streaming
./share/hadoop/mapreduce1/contrib/streaming
./share/hadoop/mapreduce1/contrib/streaming/hadoop-streaming-2.3.0-mr1-cdh5.1.2.jar
./share/hadoop/tools/lib/hadoop-streaming-2.3.0-cdh5.1.2.jar
./share/hadoop/tools/sources/hadoop-streaming-2.3.0-cdh5.1.2-sources.jar
./share/hadoop/tools/sources/hadoop-streaming-2.3.0-cdh5.1.2-test-sources.jar
./cloudera/patches/0051-MR1-CLOUDERA-BUILD.-hadoop-streaming-has-wrong-versi.patch
./cloudera/patches/0092-MR1-CLOUDERA-BUILD.-Publish-hadoop-streaming-jar.patch
./src/hadoop-mapreduce1-project/ivy/hadoop-streaming-pom-template.xml
./src/hadoop-mapreduce1-project/cloudera/maven-packaging/hadoop-streaming
./src/hadoop-mapreduce1-project/src/contrib/streaming
./src/hadoop-mapreduce1-project/src/contrib/streaming/src/test/org/apache/hadoop/streaming
./src/hadoop-mapreduce1-project/src/contrib/streaming/src/java/org/apache/hadoop/streaming
./src/hadoop-mapreduce1-project/src/docs/src/documentation/content/xdocs/streaming.xml
./src/hadoop-tools/hadoop-streaming
./src/hadoop-tools/hadoop-streaming/src/test/java/org/apache/hadoop/streaming
./src/hadoop-tools/hadoop-streaming/src/main/java/org/apache/hadoop/streaming
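
The same search can also be scripted. The following is a minimal sketch, not part of the original post: the function name find_streaming_jars is hypothetical, and it assumes that only the binary jars are of interest, so the *-sources.jar files are skipped.

```python
import fnmatch
import os


def find_streaming_jars(root):
    """Return the sorted paths of hadoop-streaming binary jars under root.

    Hypothetical helper for illustration; it mirrors the `find` command
    above but keeps only jar files and drops the *-sources.jar archives.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, "hadoop-streaming-*.jar"):
            if name.endswith("-sources.jar"):
                continue  # source archives cannot be passed to `hadoop jar`
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

A typical call would be find_streaming_jars(os.environ["HADOOP_HOME"]). On the installation listed above it should return the mr1 jar under share/hadoop/mapreduce1/contrib/streaming and the jar under share/hadoop/tools/lib.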

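2. Directory structure: the working directory contains three parts, namely the jar package, the Python scripts, and the hadoop-streaming launch script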

3. The launch script, which runs hadoop-streaming

[root@ayu python-ua]$ cat only-ua-shell.sh

hadoop jar hadoop-streaming-2.3.0-mr1-cdh5.1.2.jar \
-D mapred.job.name="ua_parse" \
-D mapred.map.tasks=1500 \
-files ua_180523_test_mr.py,rewrite_ua_parser.py,MANUFACTURER.py \
-mapper ua_180523_test_mr.py \
-input "$1" \
-output "$2" \
-numReduceTasks 1500
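
The post does not show the contents of ua_180523_test_mr.py, rewrite_ua_parser.py, or MANUFACTURER.py. As a rough illustration of what a streaming mapper of this kind might look like, here is a minimal sketch. Everything in it is an assumption: the function names parse_ua and run_mapper, the regular expressions, the one-UA-per-line input format, and the output columns are all hypothetical and are not the author's code. The only fixed convention is that a streaming mapper reads records from stdin and writes lines to stdout, where by default the text before the first tab is treated as the key.

```python
#!/usr/bin/env python
"""Hypothetical hadoop-streaming mapper sketch (not the author's script)."""
import re
import sys

# Illustrative patterns only; real UA parsing needs far more rules.
_ANDROID_RE = re.compile(
    r"Android\s+([\d.]+);\s*([^;)]+?)(?:\s+Build/[^;)]*)?[;)]")
_IPHONE_RE = re.compile(r"iPhone OS (\d+(?:_\d+)*)")


def parse_ua(ua):
    """Return an (os_name, os_version, device) tuple for a UA string."""
    match = _ANDROID_RE.search(ua)
    if match:
        return "Android", match.group(1), match.group(2).strip()
    match = _IPHONE_RE.search(ua)
    if match:
        return "iOS", match.group(1).replace("_", "."), "iPhone"
    return "unknown", "unknown", "unknown"


def run_mapper(lines, out):
    """Write one tab-separated record per non-empty input line.

    Assumes each input line is a bare UA string; the UA itself is the key.
    """
    for line in lines:
        ua = line.rstrip("\n")
        if not ua:
            continue  # skip blank lines
        out.write("\t".join((ua,) + parse_ua(ua)) + "\n")


# When deployed as the -mapper script, end the file with:
#     run_mapper(sys.stdin, sys.stdout)
```

Because the launch script passes the bare file name to -mapper, a script like this needs the shebang line and the executable bit (chmod +x) so that the task can execute it directly. It can be checked locally before submitting the job by piping a sample file through it.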