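Oozie architecture diagram
The architecture diagram shows that Oozie creates a matching job client for every job and submits the job through that client.
Secondary development on Oozie is concentrated in the Oozie server. The official site does have a custom action example: http://oozie.apache.org/docs/4.2.0/DG_CustomActionExecutor.html , but understanding how the Oozie code base is organized makes both development and debugging easier. Secondary development takes roughly three steps: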
a. Design the workflow and its XSD schema file
b. Write the action code
c. Deploy the action jar
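Step a can be checked locally before anything is deployed. The following is a minimal sketch using only the JDK's javax.xml.validation API: it validates an action XML fragment against an XSD. The <echo> action, its <message> element, and the namespace uri:custom:echo-action:0.1 are all made up for illustration and do not come from Oozie itself.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

class CustomActionSchemaCheck {
    // Hypothetical schema for an invented <echo> action (assumption, not an Oozie schema).
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
        + " xmlns:echo=\"uri:custom:echo-action:0.1\""
        + " targetNamespace=\"uri:custom:echo-action:0.1\""
        + " elementFormDefault=\"qualified\">"
        + "<xs:element name=\"echo\" type=\"echo:ECHO\"/>"
        + "<xs:complexType name=\"ECHO\">"
        + "<xs:sequence><xs:element name=\"message\" type=\"xs:string\"/></xs:sequence>"
        + "</xs:complexType>"
        + "</xs:schema>";

    // Returns true when the action XML conforms to the schema above.
    static boolean isValid(String actionXml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(actionXml)));
            return true;
        } catch (Exception e) {
            // Any parse or validation error counts as "not valid" in this sketch.
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<echo xmlns=\"uri:custom:echo-action:0.1\"><message>hi</message></echo>";
        String bad = "<echo xmlns=\"uri:custom:echo-action:0.1\"><msg>hi</msg></echo>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

Checking the schema this way catches malformed action definitions early, before the workflow is submitted to the server.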
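Code architecture
Oozie's service architecture is shown in the diagram above, but the whole server process is started by Tomcat. The interfaces for all services and events live in org.apache.oozie.servlet, and every service singleton is started by org.apache.oozie.servlet.ServicesLoader, as the figure below shows:
The org.apache.oozie.service package mainly holds the implementations of the singleton services that manage each feature.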
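The pattern described above, a single loader that starts every service singleton when the web application comes up, can be sketched in plain Java. This is only an illustration of the idea: MiniService, MiniActionService, and MiniServices are hypothetical names, and the code is not Oozie's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a service contract; not Oozie's real interface.
interface MiniService {
    void init();
    void destroy();
}

// Example service that only records whether it is running.
class MiniActionService implements MiniService {
    boolean started;
    public void init() { started = true; }
    public void destroy() { started = false; }
}

// Hypothetical registry: one instance per server process, created once at startup.
class MiniServices {
    private static MiniServices current;
    private final Map<Class<?>, MiniService> byType = new LinkedHashMap<>();

    // Called once by the startup hook (the role a servlet context listener plays).
    static synchronized MiniServices start(MiniService... toLoad) {
        MiniServices registry = new MiniServices();
        for (MiniService service : toLoad) {
            service.init();
            registry.byType.put(service.getClass(), service);
        }
        current = registry;
        return registry;
    }

    // Global access point to the running registry, or null when stopped.
    static synchronized MiniServices current() {
        return current;
    }

    // Looks up a service singleton by its concrete class.
    <T extends MiniService> T lookup(Class<T> type) {
        return type.cast(byType.get(type));
    }

    // Called once at shutdown: destroys every service and clears the registry.
    static synchronized void stop() {
        if (current != null) {
            for (MiniService service : current.byType.values()) {
                service.destroy();
            }
            current = null;
        }
    }
}
```

With this shape, any code in the server can reach a service through MiniServices.current().lookup(MiniActionService.class), which is why knowing where the registry is initialized helps when tracing startup failures.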
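Writing the action code
(Deployment of a custom action is covered first.)
Deploying the action jar
To work out how to deploy a custom action, I started from the log of Oozie's startup script. I start Oozie through Ambari, and its startup log contains the section below. This log comes from an error I triggered on purpose for debugging.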
// Unpack and prepare the oozie.war file
u"Execute['cd /var/tmp/oozie && /usr/hdp/current/oozie-server/bin/oozie-setup.sh prepare-war']" {
'not_if': 'ls /datas/hadoop/hdp/oozie/run/oozie.pid >/dev/null 2>&1 && ps -p `cat /datas/hadoop/hdp/oozie/run/oozie.pid` >/dev/null 2>&1', 'user': 'oozie'}
2015-08-28 09:47:54,538 - Error while executing command 'restart':
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 371, in restart
self.start(env)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_server.py", line 60, in start
self.configure(env)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie_server.py", line 53, in configure
oozie(is_server=True)
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie.py", line 101, in oozie
oozie_server_specific()
File "/var/lib/ambari-agent/cache/common-services/OOZIE/4.0.0.2.0/package/scripts/oozie.py", line 193, in oozie_server_specific
not_if = no_op_test
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 274, in action_run
raise ex
Fail: Execution of 'cd /var/tmp/oozie && /usr/hdp/current/oozie-server/bin/oozie-setup.sh prepare-war' returned 255. setting OOZIE_CONFIG=${OOZIE_CONFIG:-/etc/oozie/conf}
setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting JAVA_HOME=/usr/local/jdk
setting JRE_HOME=${JAVA_HOME}
setting OOZIE_LOG=/datas/hadoop/hdp/oozie/log
setting CATALINA_PID=/datas/hadoop/hdp/oozie/run/oozie.pid
setting OOZIE_DATA=/datas/hadoop/hdp/oozie/data
setting OOZIE_HTTP_PORT=11000
setting OOZIE_ADMIN_PORT=11001
setting JAVA_LIBRARY_PATH=/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
setting OOZIE_CLIENT_OPTS="${OOZIE_CLIENT_OPTS} -Doozie.connection.retry.count=5 "
setting CATALINA_OPTS="${CATALINA_OPTS} -Xmx2048m -XX:MaxPermSize=256m "
setting OOZIE_CONFIG=${OOZIE_CONFIG:-/etc/oozie/conf}
setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting JAVA_HOME=/usr/local/jdk
setting JRE_HOME=${JAVA_HOME}
setting OOZIE_LOG=/datas/hadoop/hdp/oozie/log
set