Prepare the Oozie environment
Oozie version: 4.2.0; the distribution is built manually from source.
Source tarball: oozie-4.2.0.tar.gz
tar -zxvf oozie-4.2.0.tar.gz -C $OOZIE_SRC_HOME
Building Oozie
cd $OOZIE_SRC_HOME
bin/mkdistro.sh -DskipTests -Phadoop-2 -Dhadoop.auth.version=2.6.0 -Ddistcp.version=2.6.0 -Dspark.version=2.0.2
The build produces the distribution tarball: oozie-4.2.0/distro/target/oozie-4.2.0-distro.tar.gz
Install Oozie
tar -zxvf oozie-4.2.0-distro.tar.gz -C $OOZIE_HOME
Set up libext
Create a libext directory under $OOZIE_HOME.
1. Copy the ExtJS library into libext/
2. Copy the Hadoop jars into libext/:
cp $HADOOP_HOME/share/hadoop/*/*.jar libext/
cp $HADOOP_HOME/share/hadoop/*/lib/*.jar libext/
3. Copy the MySQL JDBC driver into libext/ (Oozie's default database is Derby):
cp mysql-connector-java-5.1.25-bin.jar libext/
Oozie-related configuration changes
Hadoop's core-site.xml
<property>
<name>hadoop.proxyuser.[USER].hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.[USER].groups</name>
<value>*</value>
</property>
Here, [USER] must be replaced with the user that will later start the Oozie Tomcat server.
To apply the new configuration without restarting the Hadoop cluster:
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration
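For example, if the Oozie server will be started by a user named oozie (a placeholder; substitute your actual account), the two proxy-user properties would read:

```xml
<!-- Example only: "oozie" stands for the account that starts the Oozie Tomcat server -->
<property>
<name>hadoop.proxyuser.oozie.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.oozie.groups</name>
<value>*</value>
</property>
```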
Configure the database connection in $OOZIE_HOME/conf/oozie-site.xml
<property>
<name>oozie.service.JPAService.create.db.schema</name>
<value>true</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://node4:3306/oozie?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>root</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>root</value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/usr/hadoop/hadoop-2.6.0/etc/hadoop</value>
</property>
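With createDatabaseIfNotExist=true Oozie creates the database itself, but the configured account must be allowed to connect to MySQL on node4 from the Oozie host. A hypothetical grant matching the root/root credentials above (run in the MySQL client; adjust user, host, and password to your setup):

```sql
-- Hypothetical: allow the configured account to manage the oozie database remotely
GRANT ALL PRIVILEGES ON oozie.* TO 'root'@'%' IDENTIFIED BY 'root';
FLUSH PRIVILEGES;
```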
Initialization before the first start
a. Build the Oozie war file
bin/oozie-setup.sh prepare-war
b. Initialize the database
bin/ooziedb.sh create -sqlfile oozie.sql -run
c. Edit oozie-4.2.0/oozie-server/conf/server.xml and comment out the following listener:
<!--<Listener className="org.apache.catalina.mbeans.ServerLifecycleListener" />-->
d. Upload the sharelib jars to HDFS
bin/oozie-setup.sh sharelib create -fs hdfs://node1:8020
Start Oozie
bin/oozie-start.sh
Check that Oozie started correctly by visiting http://node3:11000/oozie/
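The same check can be done from the command line with the Oozie CLI (assuming the server host node3 from the URL above):

```shell
# Ask the server for its status; a healthy instance reports "System mode : NORMAL"
bin/oozie admin -oozie http://node3:11000/oozie -status
```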
Prepare the Spark 2.x environment
Spark version: 2.0.2, installed from the prebuilt package spark-2.0.2-bin-hadoop2.7.tgz
Install Spark
Oozie Spark2 support
1. Create a spark2 directory inside the sharelib (the timestamped lib_<ts> directory was created by the sharelib create step; find it with hdfs dfs -ls /user/oozie/share/lib):
hdfs dfs -mkdir /user/oozie/share/lib/lib_<ts>/spark2
2. Upload the Spark 2.x dependency jars to the spark2 sharelib directory:
hdfs dfs -put \
/usr/spark/jars/* \
/user/oozie/share/lib/lib_<ts>/spark2/
3. Copy the oozie-sharelib-spark jar into the spark2 sharelib directory:
hdfs dfs -cp \
/user/oozie/share/lib/lib_<ts>/spark/oozie-sharelib-spark-<version>.jar \
/user/oozie/share/lib/lib_<ts>/spark2/
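To make a workflow's Spark action use the new spark2 directory instead of the default spark one, the job must select that sharelib explicitly. A minimal job.properties sketch (property names are standard Oozie settings):

```properties
# Use the system sharelib, and pick the spark2 variant for Spark actions
oozie.use.system.libpath=true
oozie.action.sharelib.for.spark=spark2
```

Because the spark2 directory was added after the server started, run bin/oozie admin -oozie http://node3:11000/oozie -sharelibupdate (or restart Oozie) so the server picks it up; bin/oozie admin -oozie http://node3:11000/oozie -shareliblist spark2 should then list the uploaded jars.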