About the "Too many fetch-failures" error

Symptoms:
[root@localhost local_input]# hadoop jar MyWordCount.jar MyWordCount input output4
11/12/07 16:03:29 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
11/12/07 16:03:30 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
11/12/07 16:03:30 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/12/07 16:03:30 INFO input.FileInputFormat: Total input paths to process : 7
11/12/07 16:03:30 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11/12/07 16:03:30 INFO mapreduce.JobSubmitter: number of splits:7
11/12/07 16:03:30 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
11/12/07 16:03:30 INFO mapreduce.Job: Running job: job_201112071307_0003
11/12/07 16:03:31 INFO mapreduce.Job:  map 0% reduce 0%
11/12/07 16:03:45 INFO mapreduce.Job:  map 28% reduce 0%
11/12/07 16:03:46 INFO mapreduce.Job:  map 57% reduce 0%
11/12/07 16:03:52 INFO mapreduce.Job:  map 71% reduce 0%
11/12/07 16:03:54 INFO mapreduce.Job:  map 71% reduce 9%
11/12/07 16:03:57 INFO mapreduce.Job:  map 100% reduce 9%
11/12/07 16:04:35 INFO mapreduce.Job: Task Id : attempt_201112071307_0003_m_000003_0, Status : FAILED
Too many fetch-failures
11/12/07 16:04:35 WARN mapreduce.Job: Error reading task outputConnection refused
11/12/07 16:04:35 WARN mapreduce.Job: Error reading task outputConnection refused
11/12/07 16:04:39 INFO mapreduce.Job:  map 85% reduce 9%
11/12/07 16:04:45 INFO mapreduce.Job:  map 100% reduce 9%
11/12/07 16:05:00 INFO mapreduce.Job:  map 100% reduce 23%
11/12/07 16:05:14 INFO mapreduce.Job: Task Id : attempt_201112071307_0003_m_000002_0, Status : FAILED
Too many fetch-failures
11/12/07 16:05:14 WARN mapreduce.Job: Error reading task outputConnection refused
11/12/07 16:05:14 WARN mapreduce.Job: Error reading task outputConnection refused
11/12/07 16:05:18 INFO mapreduce.Job:  map 85% reduce 23%
11/12/07 16:05:24 INFO mapreduce.Job:  map 100% reduce 23%

Cause:

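After a reduce task starts, its first phase is the shuffle, in which it fetches data from the map tasks. Any single fetch can fail for reasons such as a connect timeout, a read timeout, or a checksum error. The reduce task keeps one counter per map that records how many times fetching that map's output has failed. When the failure count reaches a certain threshold, the reduce task reports to the JobTracker that fetching this map's output has failed too many times, and prints a log line like this: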

Failed to fetch map-output from attempt_201105261254_102769_m_001802_0 even after MAX_FETCH_RETRIES_PER_MAP retries... reporting to the JobTracker
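
The per-map counter described above can be sketched roughly as follows. This is a simplified illustration, not the actual ReduceTask code: the class name FetchFailureCounter and the method recordFailure are hypothetical, and the sketch assumes a failure is reported as soon as the count is at or above the threshold.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the reduce-side bookkeeping: one failure counter per map attempt.
final class FetchFailureCounter {
    private final int threshold;
    private final Map<String, Integer> failuresPerMap = new HashMap<>();

    FetchFailureCounter(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Records one failed fetch for the given map attempt.
    // Returns true once the count has reached the threshold (assumed reporting rule).
    boolean recordFailure(String mapAttemptId) {
        int failures = failuresPerMap.merge(mapAttemptId, 1, Integer::sum);
        return failures >= threshold;
    }
}
```

Each map attempt is counted independently, so one unreachable node does not affect the counters of maps that ran elsewhere.
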

The threshold is computed as:

    max(MIN_FETCH_RETRIES_PER_MAP,
        getClosestPowerOf2((this.maxBackoff * 1000 / BACKOFF_INIT) + 1));

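By default MIN_FETCH_RETRIES_PER_MAP = 2, maxBackoff = 300 and BACKOFF_INIT = 4000, so the default threshold is 6 (300 * 1000 / 4000 + 1 = 76, and the closest power of two, 64, is 2^6). The threshold can be adjusted by changing the mapred.reduce.copy.backoff parameter.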
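
To see how mapred.reduce.copy.backoff changes the threshold, the formula can be run standalone. The sketch below is self-contained and uses the default constants quoted above. The body of getClosestPowerOf2 is not shown in this post, so the version here is an assumption: it returns the exponent of the nearest power of two, which reproduces the stated default of 6. The class name FetchRetryThreshold is hypothetical.

```java
// Standalone sketch of the threshold formula; not the actual ReduceTask source.
final class FetchRetryThreshold {
    static final int MIN_FETCH_RETRIES_PER_MAP = 2;
    static final int BACKOFF_INIT = 4000; // milliseconds

    // Assumed helper: exponent of the power of two closest to value.
    static int getClosestPowerOf2(int value) {
        if (value <= 0) {
            throw new IllegalArgumentException("Undefined for " + value);
        }
        int hob = Integer.highestOneBit(value);
        // Round up when the bit just below the highest set bit is also set.
        return Integer.numberOfTrailingZeros(hob)
                + (((hob >>> 1) & value) == 0 ? 0 : 1);
    }

    // maxBackoffSeconds plays the role of mapred.reduce.copy.backoff.
    static int maxFetchRetriesPerMap(int maxBackoffSeconds) {
        return Math.max(MIN_FETCH_RETRIES_PER_MAP,
                getClosestPowerOf2((maxBackoffSeconds * 1000 / BACKOFF_INIT) + 1));
    }

    public static void main(String[] args) {
        for (int backoff : new int[] {4, 100, 300, 600}) {
            System.out.println("backoff=" + backoff + "s -> threshold="
                    + maxFetchRetriesPerMap(backoff));
        }
    }
}
```

Under this assumption the threshold grows only logarithmically with the backoff: doubling mapred.reduce.copy.backoff from 300 to 600 raises it from 6 to 7.
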

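Once the threshold is reached, the reduce task informs its TaskTracker through the umbilical protocol, and the TaskTracker notifies the JobTracker on its next heartbeat. When the JobTracker sees that more than 50% of the reduces have reported repeated failures fetching a particular map's output, it marks that map as failed and reschedules it, printing a log line like this: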

"Too many fetch-failures for output of task: attempt_201105261254_102769_m_001802_0 ... killing it"

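The JobTracker-side decision can be sketched in the same spirit. This is a simplified illustration built only on the 50% rule stated above: the class MapFaultDecider and its method are hypothetical, and the sketch assumes each reduce reports a given map at most once.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decide when a map attempt should be failed and rescheduled.
final class MapFaultDecider {
    // The 50% figure from the text.
    static final float MAX_ALLOWED_FAILURE_FRACTION = 0.5f;

    private final int runningReduces;
    private final Map<String, Integer> reportsPerMap = new HashMap<>();

    MapFaultDecider(int runningReduces) {
        if (runningReduces <= 0) {
            throw new IllegalArgumentException("runningReduces must be positive");
        }
        this.runningReduces = runningReduces;
    }

    // Called when a reduce reports too many fetch failures for mapAttemptId.
    // Returns true when more than 50% of the reduces have reported that map.
    boolean onFetchFailureReport(String mapAttemptId) {
        int reports = reportsPerMap.merge(mapAttemptId, 1, Integer::sum);
        return (float) reports / runningReduces > MAX_ALLOWED_FAILURE_FRACTION;
    }
}
```

With 4 running reduces, for example, the third report for the same map attempt is the first one that crosses the 50% mark in this sketch.
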
Solution:

None yet.

