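Using Hadoop together with Oozie runs into all sorts of annoying problems, most of them configuration-related. Some of them are recorded below:
1. hadoop-2.5.0-cdh5.3.0 => oozie-4.0.0-cdh5.3.0
<1> Oozie's authentication against Hadoop fails: User: xxx is not allowed to impersonate xxx
Configure this in $HADOOP_HOME/etc/hadoop/core-site.xml. My settings are: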
<property>
<name>hadoop.proxyuser.A.hosts</name>
<value>oozie-ip</value>
</property>
<property>
<name>hadoop.proxyuser.A.groups</name>
<value>A-group</value>
</property>
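Here A is the user that submits the Oozie job, oozie-ip is the IP of the machine Oozie runs on, and A-group is the group A belongs to. Note that Hadoop matches hadoop.proxyuser.<name> against the user doing the impersonation, i.e. the first xxx in the error message (normally the account the Oozie server runs as).
<2> The job stays in PREP after being submitted to Oozie
In job.properties, set jobtracker to the address (host:port) of Hadoop's YARN ResourceManager; the port is usually 8032.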
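A minimal job.properties sketch for this case, written out with a shell heredoc. The host names rm-host and nn-host, the NameNode port 9000 and the application path are placeholders, not values from my cluster; only port 8032 comes from the note above. The key is spelled jobTracker here as in Oozie's bundled examples; use whatever name your workflow.xml actually references.

```shell
# Write a minimal job.properties; rm-host, nn-host and the app path are placeholders.
# The quoted 'EOF' keeps the shell from expanding ${nameNode} and ${user.name}.
cat > job.properties <<'EOF'
nameNode=hdfs://nn-host:9000
# Must be the YARN ResourceManager address, not an MRv1 JobTracker
jobTracker=rm-host:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
EOF
```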
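<3> The Oozie map-reduce action stays in RUNNING after the job has finished, then changes to SUSPENDED once the time limit is exceeded. The log shows the following error:
java.net.ConnectException: to 0.0.0.0:10020 failed on connection exception. This means the JobHistory service has not been started.
Add the following to Hadoop's mapred-site.xml: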
<property>
<name>mapreduce.jobhistory.address</name>
<value>ip:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>ip:19888</value>
</property>
Restart Hadoop, then run:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
After that, run the Oozie job again and it works.
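Before resubmitting, it can help to confirm that something is actually listening on the JobHistory port. The helper below is only a sketch: port_open is a hypothetical name, it relies on bash's /dev/tcp feature (it will not work in plain sh), and it has no timeout, so it may block for a while if the host silently drops packets.

```shell
# Hypothetical helper (bash only): succeeds if host:port accepts a TCP connection.
port_open() {
  # /dev/tcp is a bash pseudo-device; the subshell closes the socket when it exits.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Example: port_open <history-server-ip> 10020 && echo "JobHistory server is reachable"
```

Running jps on the history server machine and looking for a JobHistoryServer process is another quick check.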
<4> Configure the sharelib path (in oozie-site.xml)
<property>
<name>oozie.service.WorkflowAppService.system.libpath</name>
<value>hdfs://localhost:9000/user/${user.name}/share/lib</value>
<description>
System library path to use for workflow applications.
This path is added to workflow application if their job properties sets
the property 'oozie.use.system.libpath' to true.
</description>
</property>
Then create the sharelib on HDFS (run from Oozie's bin directory):
./oozie-setup.sh sharelib create -fs hdfs://master:9000 -locallib /home/root/oozie-4.0.0-cdh5.3.0/oozie-sharelib-4.0.0-cdh5.3.0-yarn.tar.gz
<5> Sharelib problem caused by a Hive job
IllegalArgumentException: Wrong FS: file:/user/root/share/lib/lib_20150427214702/hive/jersey-server-1.9.jar, expected: hdfs://localhost:9000
=> In job.properties, set oozie.libpath=${nameNode}/user/${user.name}/share/lib/lib_20150427214702/hive
<6> A submitted Oozie job stays in RUNNING or PREP
=> Check that the jobtracker property in job.properties matches the address of YARN's ResourceManager.
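The comparison can be scripted roughly as follows. This is a sketch under several assumptions: the helper names are made up, the key in job.properties is spelled jobTracker, yarn.resourcemanager.address is set explicitly in yarn-site.xml, and its <name> and <value> tags sit on consecutive lines as in the snippets above. If only yarn.resourcemanager.hostname is set, the lookup prints nothing and the check reports a mismatch.

```shell
# Hypothetical helper: print the <value> that follows a given <name> in a *-site.xml.
# Assumes <name> and <value> are on consecutive lines.
site_value() {
  sed -n "/<name>$1<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$2"
}

# Hypothetical helper: print the value of a key from a .properties file.
prop_value() {
  sed -n "s/^$1=//p" "$2"
}

# Compare jobTracker in job.properties ($1) with
# yarn.resourcemanager.address in yarn-site.xml ($2).
check_job_tracker() {
  local jt rm_addr
  jt=$(prop_value jobTracker "$1")
  rm_addr=$(site_value yarn.resourcemanager.address "$2")
  if [ -n "$jt" ] && [ "$jt" = "$rm_addr" ]; then
    echo "OK: jobTracker=$jt"
  else
    echo "MISMATCH: jobTracker='$jt' resourcemanager='$rm_addr'"
    return 1
  fi
}
```

Usage would look like: check_job_tracker job.properties $HADOOP_HOME/etc/hadoop/yarn-site.xml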