Hive on Tez task failures

Tez jobs on our cluster have been failing frequently lately. The error output is shown below.

Error log

Map 1: 555(+41)/596 Reducer 2: 0(+0,-2)/1   
15/09/23 14:50:35 INFO SessionState: Map 1: 555(+41)/596    Reducer 2: 0(+0,-2)/1   
Map 1: 555(+41)/596 Reducer 2: 0(+1,-2)/1   
15/09/23 14:50:37 INFO SessionState: Map 1: 555(+41)/596    Reducer 2: 0(+1,-2)/1   
Map 1: 555(+41)/596 Reducer 2: 0(+1,-3)/1   
15/09/23 14:50:38 INFO SessionState: Map 1: 555(+41)/596    Reducer 2: 0(+1,-3)/1   
Map 1: 555(+41)/596 Reducer 2: 0(+1,-3)/1   
15/09/23 14:50:41 INFO SessionState: Map 1: 555(+41)/596    Reducer 2: 0(+1,-3)/1   
Map 1: 555(+0)/596  Reducer 2: 0(+0,-4)/1   
15/09/23 14:50:44 INFO SessionState: Map 1: 555(+0)/596 Reducer 2: 0(+0,-4)/1   
Status: Failed
15/09/23 14:50:45 ERROR SessionState: Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1442391298043_123239_1_01, diagnostics=[Task failed, taskId=task_1442391298043_123239_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1442391298043_123239_01_008650 finished with diagnostics set to [Container preempted internally]], TaskAttempt 1 failed, info=[Container container_1442391298043_123239_01_008771 finished with diagnostics set to [Container preempted internally]], TaskAttempt 2 failed, info=[Container container_1442391298043_123239_01_009010 finished with diagnostics set to [Container preempted internally]], TaskAttempt 3 failed, info=[Container container_1442391298043_123239_01_009723 finished with diagnostics set to [Container preempted internally]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1442391298043_123239_1_01 [Reducer 2] killed/failed due to:null]
15/09/23 14:50:45 ERROR SessionState: Vertex failed, vertexName=Reducer 2, vertexId=vertex_1442391298043_123239_1_01, diagnostics=[Task failed, taskId=task_1442391298043_123239_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1442391298043_123239_01_008650 finished with diagnostics set to [Container preempted internally]], TaskAttempt 1 failed, info=[Container container_1442391298043_123239_01_008771 finished with diagnostics set to [Container preempted internally]], TaskAttempt 2 failed, info=[Container container_1442391298043_123239_01_009010 finished with diagnostics set to [Container preempted internally]], TaskAttempt 3 failed, info=[Container container_1442391298043_123239_01_009723 finished with diagnostics set to [Container preempted internally]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1442391298043_123239_1_01 [Reducer 2] killed/failed due to:null]
Vertex killed, vertexName=Map 1, vertexId=vertex_1442391298043_123239_1_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1442391298043_123239_1_00 [Map 1] killed/failed due to:null]
15/09/23 14:50:45 ERROR SessionState: Vertex killed, vertexName=Map 1, vertexId=vertex_1442391298043_123239_1_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1442391298043_123239_1_00 [Map 1] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1

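Analysis:

Task task_1442391298043_123239_1_01_000000 failed 4 times, and each time the cause was that its container was preempted by a higher-priority task. The default maximum number of failed attempts per task (tez.am.task.max.failed.attempts) is 4, so the fourth failed attempt fails the task, which in turn fails the Reducer 2 vertex and the whole DAG. This problem is more likely to show up when the cluster is running a lot of jobs.

Solution:

Raise the default values: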
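To check this kind of diagnosis quickly, the attempt failures can be counted straight from the vertex diagnostics. The following is only a rough sketch: the regular expression is written against the message format seen in this particular log, the DIAGNOSTICS string is rebuilt from the four container IDs that appear in it, and the helper name summarize_attempts is made up for illustration.

```python
import re
from collections import Counter

# Container ID suffixes of the four failed attempts, copied from the log.
CONTAINER_SUFFIXES = ["008650", "008771", "009010", "009723"]

# Rebuild the per-attempt part of the diagnostics message (assumption: same
# wording as in the log above; a real script would read the log file instead).
DIAGNOSTICS = ", ".join(
    "TaskAttempt {i} failed, info=[Container "
    "container_1442391298043_123239_01_{s} finished with diagnostics "
    "set to [Container preempted internally]]".format(i=i, s=s)
    for i, s in enumerate(CONTAINER_SUFFIXES)
)

# Captures: attempt number, container ID, failure reason.
ATTEMPT_RE = re.compile(
    r"TaskAttempt (\d+) failed, info=\[Container (\S+) "
    r"finished with diagnostics set to \[([^\]]*)\]\]"
)


def summarize_attempts(diagnostics):
    """Return (number of failed attempts, Counter of failure reasons)."""
    matches = ATTEMPT_RE.findall(diagnostics)
    reasons = Counter(reason for _attempt, _container, reason in matches)
    return len(matches), reasons


if __name__ == "__main__":
    count, reasons = summarize_attempts(DIAGNOSTICS)
    print(count, dict(reasons))
```

If every attempt reports the same preemption reason and the count equals the configured maximum, the task did not hit a bug in the query itself; it simply ran out of retries.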

tez.am.task.max.failed.attempts=10
tez.am.max.app.attempts=5
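One possible way to apply these values to a single job rather than cluster-wide is to pass them to the Hive CLI through its --hiveconf option. The sketch below only builds the command line; the function name build_hive_command and the sample query are placeholders, and whether a given property takes effect per session depends on the Hive and Tez versions in use, so treat it as an assumption to verify on your cluster.

```python
import shlex

# The two overrides from the solution above.
TEZ_OVERRIDES = {
    "tez.am.task.max.failed.attempts": 10,
    "tez.am.max.app.attempts": 5,
}


def build_hive_command(query, overrides):
    """Build an argv list for the Hive CLI, one --hiveconf flag per override."""
    argv = ["hive"]
    for key, value in overrides.items():
        argv += ["--hiveconf", "{}={}".format(key, value)]
    argv += ["-e", query]
    return argv


if __name__ == "__main__":
    # "SELECT 1" is a placeholder query.
    cmd = build_hive_command("SELECT 1", TEZ_OVERRIDES)
    print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be handed to subprocess.run as-is, which avoids shell quoting problems with the query text.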