hadoop-mapreduce
houzhizhen
Focused on big data processing and distributed computing.
Changing the Log4j debug with `hadoop jar` in MapReduce jobs
Short Description: The proper way to set the Log4j level on MapReduce jobs at launch time for Apache Hadoop 2.x. Article: One of the most common questions I come across when trying to help debug Map... Reposted · 2018-02-11 19:40:24 · 319 views · 0 comments
Hadoop 2.7.5 MapReduce Recovery
MapReduce can recover from the last attempt: map or reduce tasks that succeeded in a previous application attempt are not executed again. MRAppMaster.serviceInit calls processRecovery to recover... Original · 2018-04-18 14:43:56 · 470 views · 0 comments
Hadoop 2.7.5 MapReduce JobHistoryParser
JobHistoryParser // historyFile: hdfs://localhost:8020/tmp/hadoop-yarn/staging/houzhizhen/.staging/job_1523876398612_0014/job_1523876398612_0014_1.jhist FSDataInputStream in = open historyFile Jo... Original · 2018-04-18 17:32:07 · 446 views · 0 comments
hadoop compress file
Compress the files in one directory into another directory: hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-0.20.2-cdh3u2.jar \ -Dmapred.output.compress=true \ -Dmapred.compress.map.output=true... Original · 2018-05-16 09:55:24 · 273 views · 0 comments
Out-of-memory errors in the MapReduce shuffle phase: cause analysis and fixes
Symptom: while a reduce task is running, an out-of-memory error sometimes occurs, and the following exception is thrown: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffl... Original · 2018-12-03 17:13:03 · 8137 views · 4 comments