Problems running MR jobs from Hive on YARN

Both the error and its solution are described in this thread:
http://grokbase.com/p/cloudera/cdh-user/126wqvfwyt/hive-refuses-to-work-with-yarn

The symptom: when Hive launches an MR job, it fails with the following error:

[quote]
Error during job, obtaining debugging information...
Examining task ID: task_1347634167505_0004_m_000000 (and more) from job job_1347634167505_0004
Exception in thread "Thread-19" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:206)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:147)
at org.apache.hadoop.hive.ql.exec.JobTrackerURLResolver.getURL(JobTrackerURLResolver.java:42)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:209)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:92)
at java.lang.Thread.run(Thread.java:722)
Execution failed with exit status: 2
[/quote]


The explanation in the linked thread is clear:
[quote]
There are actually two failures here:

1) The MR job that Hive launched on your cluster failed for some reason. I
can't determine why based on the information provided. I recommend trying
to locate the task logs for the failed tasks on the cluster.

2) When a job fails Hive attempts to automatically retrieve the task logs
from the JobTracker's TaskLogServlet. This service doesn't exist in MR2,
which is why Hive is throwing an exception (either because
mapred.job.tracker is undefined, or because it can't find the
TaskLogServlet service running on the machine that mapred.job.tracker
points to). This is a known issue and one that we plan to address in the
next release of CDH.
[/quote]

The fix is also spelled out in detail: add two settings to hive-site.xml.
[quote]
In the meantime I recommend doing the following if you need to run Hive on
MR2:
* Keep Hive happy by setting mapred.job.tracker to a bogus value.
* Disable task log retrieval by setting
hive.exec.show.job.failure.debug.info=false
[/quote]
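
As a rough illustration, the two settings from the quote could look like this in hive-site.xml. This is only a sketch: the value `bogus.example.invalid:8021` is a made-up placeholder, not something given in the thread. Because the exception above complains that `local` is not a valid host:port authority, a host:port-shaped placeholder seems the safer choice, but that is an assumption.

```xml
<!-- Placed inside the <configuration> element of hive-site.xml -->

<!-- Placeholder JobTracker address. The host and port are invented;
     nothing needs to be listening there. -->
<property>
  <name>mapred.job.tracker</name>
  <value>bogus.example.invalid:8021</value>
</property>

<!-- Stop Hive from trying to fetch task logs from the JobTracker's
     TaskLogServlet, which does not exist in MR2. -->
<property>
  <name>hive.exec.show.job.failure.debug.info</name>
  <value>false</value>
</property>
```

Both properties can presumably also be set per session from the Hive CLI with `set name=value;`, which is convenient for testing the change before editing the file.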

After that, everything works.
Hive can submit its jobs to a YARN cluster. By default Hive uses MapReduce as its execution engine, and a few configuration parameters control how jobs are submitted to YARN. The steps are:

1. Confirm that the Hadoop cluster and the YARN cluster are correctly configured and running.

2. Set the following parameters in hive-site.xml. Define hive.execution.engine only once, as either mr or tez. The hive.tez.* and hive.server2.tez.* properties only take effect when the engine is tez.

```
<!-- Execution engine: mr or tez -->
<property>
  <name>hive.execution.engine</name>
  <value>mr</value>
</property>
<property>
  <name>hive.exec.submitviachild</name>
  <value>false</value>
</property>
<property>
  <name>mapreduce.jobtracker.address</name>
  <value>yarn-cluster</value>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>hostname:port</value>
</property>

<!-- Only used when hive.execution.engine is tez -->
<property>
  <name>hive.tez.container.size</name>
  <value>1024</value>
</property>
<property>
  <name>hive.tez.java.opts</name>
  <value>-Xmx819m</value>
</property>
<property>
  <name>hive.tez.log.level</name>
  <value>INFO</value>
</property>
<property>
  <name>hive.server2.tez.initialize.default.sessions</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.tez.sessions.per.default.queue</name>
  <value>1</value>
</property>
<property>
  <name>hive.server2.tez.default.queues</name>
  <value>default</value>
</property>
```

3. Add hive-exec-x.x.x.jar, hive-metastore-x.x.x.jar, hive-common-x.x.x.jar, hive-cli-x.x.x.jar, hive-service-x.x.x.jar and hive-service-rpc-x.x.x.jar to Hadoop's share/hadoop/mapreduce/lib directory.

4. When running a job in Hive, the following commands direct it to the YARN cluster:

```
set mapreduce.framework.name=yarn;
set yarn.resourcemanager.address=<resourcemanager_address>;
```

Here <resourcemanager_address> is the address of the YARN ResourceManager.

With these settings in place, Hive jobs are submitted to the YARN cluster.