1.Oozie
2.Oozie
- Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
- Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions.
- Oozie Coordinator jobs (triggered jobs) are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability.
- Oozie is integrated with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce, Pig, Hive, Sqoop and Distcp) as well as system-specific jobs (such as Java programs and shell scripts).
- Oozie is a scalable, reliable and extensible system.
- Developers interested in getting more involved with Oozie may join the mailing lists, report bugs, retrieve code from the version control system, and make contributions.
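To make the Workflow and Coordinator ideas above concrete, here is a minimal sketch that writes a one-action workflow DAG, a daily coordinator that triggers it, and a job.properties file into a scratch directory. The application name `demo-wf`, the scratch path, the ResourceManager port 8032, the dates, and the `wfAppPath` property are all assumptions made for illustration; only `hdfs://hadoop.com:8020` comes from these notes. The schema versions are the ones I would expect an Oozie 4.0 install to accept, so adjust them if your server rejects them.

```shell
#!/bin/sh
# Write a minimal Oozie application into a scratch directory.
# APP_DIR and every name inside the files are illustrative assumptions.
APP_DIR="${TMPDIR:-/tmp}/oozie-demo-wf"
mkdir -p "$APP_DIR"

# Quoted heredoc delimiter: ${...} stays literal so Oozie resolves it later.
# The DAG is: start -> shell-node -> end (or -> fail on error).
cat > "$APP_DIR/workflow.xml" <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.4" name="demo-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello-oozie</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF

# A coordinator that runs the workflow once a day (time trigger only).
cat > "$APP_DIR/coordinator.xml" <<'EOF'
<coordinator-app xmlns="uri:oozie:coordinator:0.4" name="demo-coord"
                 frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z" timezone="UTC">
    <action>
        <workflow>
            <app-path>${wfAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
EOF

# Properties read at submission time; the jobTracker port is an assumption.
cat > "$APP_DIR/job.properties" <<'EOF'
nameNode=hdfs://hadoop.com:8020
jobTracker=hadoop.com:8032
wfAppPath=${nameNode}/user/${user.name}/apps/demo-wf
oozie.coord.application.path=${wfAppPath}
EOF

ls "$APP_DIR"
```

The files would still need to be uploaded to HDFS before a job could be submitted against them; the sketch only shows their shape.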
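3. The three core services
Bundle (a wrapper around Coordinators), Coordinator (triggered by time and data availability), and Workflow (the workflow itself).
A Bundle can start, stop, suspend, resume, and rerun a set of Coordinator jobs.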
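As a rough illustration of the Bundle operations listed above, the following dry-run sketch prints the `oozie job` command line for each one instead of executing it. The bundle ID and the coordinator name `demo-coord` are hypothetical, and mapping "stop" onto the CLI's `-kill` option is my assumption rather than something these notes state.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the Oozie CLI call for each
# Bundle lifecycle operation. BUNDLE_ID is a hypothetical placeholder.
OOZIE_URL="http://hadoop.com:11000/oozie"
BUNDLE_ID="0000001-150101000000000-oozie-oozi-B"

# Map an operation name to `oozie job` arguments.
# Assumption: "stop" corresponds to the CLI's -kill option.
# For rerun, $2 is the coordinator name(s) to rerun inside the bundle.
bundle_cmd() {
    case "$1" in
        start|suspend|resume) args="-$1 $BUNDLE_ID" ;;
        stop)                 args="-kill $BUNDLE_ID" ;;
        rerun)                args="-rerun $BUNDLE_ID -coordinator $2" ;;
        *) echo "unknown operation: $1" >&2; return 1 ;;
    esac
    echo "bin/oozie job -oozie $OOZIE_URL $args"
}

for op in start suspend resume stop; do
    bundle_cmd "$op"
done
bundle_cmd rerun demo-coord
```

Printing the commands first makes it easy to review them before running anything against a live server.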
4. Oozie installation and deployment
System requirements
- Unix (tested in Linux and Mac OS X)
- Java 1.7+
- Hadoop
- Apache Hadoop (tested with 1.2.1 & 2.4.0+)
- ExtJS library (optional, to enable Oozie webconsole)
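1. In Hadoop's core-site.xml, add the following properties, replacing the uppercase placeholders with real values. Hadoop must be restarted after the change.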
<property>
<name>hadoop.proxyuser.[OOZIE_SERVER_USER].hosts</name>
<value>[OOZIE_SERVER_HOSTNAME]</value>
</property>
<property>
<name>hadoop.proxyuser.[OOZIE_SERVER_USER].groups</name>
<value>[USER_GROUPS_THAT_ALLOW_IMPERSONATION]</value>
</property>
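The two proxyuser properties above still contain placeholders. As a small sketch of what the filled-in result might look like, the script below prints both properties from three shell variables. The user name `oozie` and the `*` group wildcard are assumptions chosen for illustration; `hadoop.com` is the host name used elsewhere in these notes.

```shell
#!/bin/sh
# Print the two proxyuser properties with the placeholders filled in.
# The values below are illustrative assumptions; substitute your own.
OOZIE_USER="oozie"        # OS user that runs the Oozie server
OOZIE_HOST="hadoop.com"   # host the Oozie server runs on
PROXY_GROUPS="*"          # groups whose members may be impersonated

# $1 = property suffix (hosts or groups), $2 = property value
proxyuser_property() {
    printf '<property>\n<name>hadoop.proxyuser.%s.%s</name>\n<value>%s</value>\n</property>\n' \
        "$OOZIE_USER" "$1" "$2"
}

proxyuser_property hosts  "$OOZIE_HOST"
proxyuser_property groups "$PROXY_GROUPS"
```

The output can be pasted inside the `<configuration>` element of core-site.xml; a narrower group list than `*` is usually the safer choice on a shared cluster.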
2. In Oozie's oozie-site.xml, point Oozie at the Hadoop configuration directory:
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/etc/hadoop</value>
<description>
Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
the Oozie configuration directory; the path can also be absolute (e.g. to point
to Hadoop client conf/ directories in the local filesystem).
</description>
</property>
3. Create a libext/ directory in the directory where Oozie was expanded.
4. If using a version of Hadoop bundled in Oozie's hadooplibs/, copy the corresponding Hadoop JARs from hadooplibs/ to the libext/ directory. If using a different version of Hadoop, copy the required Hadoop JARs from that version into the libext/ directory.
If using the ExtJS library, copy the ZIP file to the libext/ directory.
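The libext/ preparation described above can be sketched as a few shell commands. To keep the sketch runnable without a real Oozie install, it operates on a throwaway directory tree with empty stand-in files; the directory name `hadooplib-demo`, the JAR name, and the ZIP name `ext-demo.zip` are hypothetical and would be replaced by the real names from your distribution.

```shell
#!/bin/sh
# Sketch of the libext/ preparation on a throwaway directory tree.
# DEMO_HOME stands in for the directory where Oozie was expanded.
DEMO_HOME="${TMPDIR:-/tmp}/oozie-demo-home"
HADOOPLIB_DIR="$DEMO_HOME/hadooplibs/hadooplib-demo"   # hypothetical name
EXTJS_ZIP="$DEMO_HOME/ext-demo.zip"                    # hypothetical name

# Demo-only fixtures: empty files standing in for the real JARs and ZIP.
mkdir -p "$HADOOPLIB_DIR"
: > "$HADOOPLIB_DIR/hadoop-common-demo.jar"
: > "$EXTJS_ZIP"

# Create libext/ inside the expanded Oozie directory.
mkdir -p "$DEMO_HOME/libext"

# Copy the Hadoop JARs for the matching Hadoop version.
cp "$HADOOPLIB_DIR"/*.jar "$DEMO_HOME/libext/"

# Optional: copy the ExtJS ZIP to enable the web console.
cp "$EXTJS_ZIP" "$DEMO_HOME/libext/"

ls "$DEMO_HOME/libext"
```

On a real install, only the `mkdir` and the two `cp` commands are needed, with the paths pointing into the actual Oozie directory.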
The oozie-setup.sh script configures all Oozie components. Its usage is:
$ bin/oozie-setup.sh prepare-war [-d directory] [-secure]
sharelib create -fs <FS_URI> [-locallib <PATH>]
sharelib upgrade -fs <FS_URI> [-locallib <PATH>]
db create|upgrade|postupgrade -run [-sqlfile <FILE>]
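The FS_URI does not need a path component.
# Build the WAR file; it is automatically deployed into the webapp directory of the Oozie server.
$ bin/oozie-setup.sh prepare-war
# Create the sharelib files in HDFS.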
$ bin/oozie-setup.sh sharelib create -fs hdfs://hadoop.com:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
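The result can be checked on the HDFS NameNode web UI (port 50070): the sharelib is created inside a directory named after the creation timestamp.
Create the database: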
$ bin/ooziedb.sh create -sqlfile oozie.sql -run
# Start Oozie as a daemon process:
$ bin/oozied.sh start
# Or start it as a foreground process instead:
$ bin/oozied.sh run
# Check the Oozie status; it should report NORMAL:
$ bin/oozie admin -oozie http://hadoop.com:11000/oozie -status
# Oozie server web address:
http://hadoop.com:11000
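Since the server can take a moment to come up after starting, a small polling loop around the status check may be handy. The sketch below assumes the CLI prints a line of the form `System mode: NORMAL`, which is what I would expect from `oozie admin -status`; the retry count and the 3-second sleep are arbitrary choices. It only polls when `bin/oozie` is present in the current directory.

```shell
#!/bin/sh
# Sketch: wait until the Oozie server reports NORMAL system mode.
OOZIE_URL="http://hadoop.com:11000/oozie"

# Extract the mode from a line such as "System mode: NORMAL".
# Prints nothing when the input has no such line.
parse_mode() {
    printf '%s\n' "$1" | sed -n 's/^System mode: *//p'
}

# Poll the status up to $1 times (default 10), sleeping between attempts.
wait_for_oozie() {
    tries="${1:-10}"
    while [ "$tries" -gt 0 ]; do
        out="$(bin/oozie admin -oozie "$OOZIE_URL" -status 2>/dev/null)"
        if [ "$(parse_mode "$out")" = "NORMAL" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 3
    done
    return 1
}

# Only poll when the Oozie CLI is actually present.
if [ -x bin/oozie ]; then
    if wait_for_oozie 10; then
        echo "Oozie is up"
    else
        echo "Oozie did not reach NORMAL"
    fi
fi
```

Run it from the Oozie install directory right after `bin/oozied.sh start`.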