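Table of Contents
Hadoop Cluster (CDH4) in Practice (0): Preface
Hadoop Cluster (CDH4) in Practice (1): Setting Up Hadoop (HDFS)
Hadoop Cluster (CDH4) in Practice (2): Setting Up HBase & ZooKeeper
Hadoop Cluster (CDH4) in Practice (3): Setting Up Hive
Hadoop Cluster (CDH4) in Practice (4): Setting Up Oozie
Hadoop Cluster (CDH4) in Practice (5): Installing Sqoop
This Article
Hadoop Cluster (CDH4) in Practice (4): Setting Up Oozie
References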
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/CDH4-Installation-Guide.html
Environment
OS: CentOS 6.4 x86_64
Servers:
hadoop-master: 172.17.20.230, 10 GB RAM
- namenode
- hbase-master
hadoop-secondary: 172.17.20.234, 10 GB RAM
- secondarynamenode,jobtracker
- hive-server,hive-metastore
- oozie
hadoop-node-1: 172.17.20.231, 10 GB RAM
- datanode,tasktracker
- hbase-regionserver,zookeeper-server
hadoop-node-2: 172.17.20.232, 10 GB RAM
- datanode,tasktracker
- hbase-regionserver,zookeeper-server
hadoop-node-3: 172.17.20.233, 10 GB RAM
- datanode,tasktracker
- hbase-regionserver,zookeeper-server
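A brief description of the roles above:
namenode - manages the namespace of the entire HDFS
secondarynamenode - performs periodic checkpoints for the namenode by merging the edit log into the fsimage; it is not a hot standby
jobtracker - manages jobs for parallel computation
datanode - the HDFS storage node service
tasktracker - executes the tasks of parallel computation jobs
hbase-master - the HBase management service
hbase-regionserver - serves client inserts, deletes, and queries
zookeeper-server - the ZooKeeper coordination and configuration management service
hive-server - the Hive management service
hive-metastore - Hive's metadata store, used for type checking and syntax analysis against the metadata
oozie - a Java web application for defining and managing workflows
Convention used in this article, to avoid confusion when configuring multiple servers:
All of the following operations only need to be run on the Oozie host, hadoop-secondary.
1. Prerequisites
Complete the steps in Hadoop Cluster (CDH4) in Practice (3): Setting Up Hive.
2. Install Oozie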
$ sudo yum install oozie oozie-client
3. Create the Oozie database
$ mysql -uroot -phiveserver
mysql> create database oozie;
mysql> grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
mysql> grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
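The same statements can also be fed to mysql non-interactively, which is convenient when the step has to be repeated. The following is only a sketch: the function name oozie_db_sql is made up for illustration, and it simply prints the three statements shown above for a given database, user, and password.

```shell
# Hypothetical helper (not part of CDH or MySQL): prints the SQL from step 3
# for the given database name, user name, and password.
oozie_db_sql() {
  db="$1"
  user="$2"
  pass="$3"
  cat <<EOF
create database ${db};
grant all privileges on ${db}.* to '${user}'@'localhost' identified by '${pass}';
grant all privileges on ${db}.* to '${user}'@'%' identified by '${pass}';
EOF
}

# Usage (commented out so the sketch has no side effects):
# oozie_db_sql oozie oozie oozie | mysql -uroot -p
```

Piping the output into mysql runs exactly the statements listed in this step; nothing else is created or changed.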
4. Configure oozie-site.xml
$ sudo vim /etc/oozie/conf/oozie-site.xml
<?xml version="1.0"?>
<name>oozie.service.ActionService.executor.ext.classes</name>
org.apache.oozie.action.email.EmailActionExecutor,
org.apache.oozie.action.hadoop.HiveActionExecutor,
org.apache.oozie.action.hadoop.ShellActionExecutor,
org.apache.oozie.action.hadoop.SqoopActionExecutor,
org.apache.oozie.action.hadoop.DistcpActionExecutor
<name>oozie.service.SchemaService.wf.ext.schemas</name>
<value>shell-action-0.1.xsd,shell-action-0.2.xsd,email-action-0.1.xsd,hive-action-0.2.xsd,hive-action-0.3.xsd,hive-action-0.4.xsd,hive-action-0.5.xsd,sqoop-action-0.2.xsd,sqoop-action-0.3.xsd,ssh-action-0.1.xsd,ssh-action-0.2.xsd,distcp-action-0.1.xsd</value>
<name>oozie.system.id</name>
<value>oozie-${user.name}</value>
<name>oozie.systemmode</name>
<value>NORMAL</value>
<name>oozie.service.AuthorizationService.security.enabled</name>
<name>oozie.service.PurgeService.older.than</name>
<name>oozie.service.PurgeService.purge.interval</name>
<name>oozie.service.CallableQueueService.queue.size</name>
<name>oozie.service.CallableQueueService.threads</name>
<name>oozie.service.CallableQueueService.callable.concurrency</name>
<name>oozie.service.coord.normal.default.timeout</name>
<name>oozie.db.schema.name</name>
<name>oozie.service.JPAService.create.db.schema</name>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
<name>oozie.service.JPAService.jdbc.url</name>
<name>oozie.service.JPAService.jdbc.username</name>
<name>oozie.service.JPAService.jdbc.password</name>
<name>oozie.service.JPAService.pool.max.active.conn</name>
<name>oozie.service.HadoopAccessorService.kerberos.enabled</name>
<name>local.realm</name>
<value>LOCALHOST</value>
<name>oozie.service.HadoopAccessorService.keytab.file</name>
<value>${user.home}/oozie.keytab</value>
<name>oozie.service.HadoopAccessorService.kerberos.principal</name>
<value>${user.name}/localhost@${local.realm}</value>
<name>oozie.service.HadoopAccessorService.jobTracker.whitelist</name>
<name>oozie.service.HadoopAccessorService.nameNode.whitelist</name>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/etc/hadoop/conf</value>
<name>oozie.service.WorkflowAppService.system.libpath</name>
<value>/user/${user.name}/share/lib</value>
<name>use.system.libpath.for.mapreduce.and.pig.jobs</name>
<name>oozie.authentication.type</name>
<value>simple</value>
<name>oozie.authentication.token.validity</name>
<name>oozie.authentication.signature.secret</name>
<name>oozie.authentication.cookie.domain</name>
<name>oozie.authentication.simple.anonymous.allowed</name>
<name>oozie.authentication.kerberos.principal</name>
<value>HTTP/localhost@${local.realm}</value>
<name>oozie.authentication.kerberos.keytab</name>
<value>${oozie.service.HadoopAccessorService.keytab.file}</value>
<name>oozie.authentication.kerberos.name.rules</name>
<value>DEFAULT</value>
<name>oozie.service.ProxyUserService.proxyuser.oozie.hosts</name>
<name>oozie.service.ProxyUserService.proxyuser.oozie.groups</name>
<name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
<name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
<name>oozie.action.mapreduce.uber.jar.enable</name>
<name>oozie.service.HadoopAccessorService.supported.filesystems</name>
<value>hdfs,viewfs</value>
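After editing a file with this many properties, it helps to confirm what value a given property actually carries on disk. The sketch below assumes the usual Hadoop-style layout in which each name element is followed by its value element on a single line; get_prop is a made-up helper name, and multi-line values such as the executor class list are not handled.

```shell
# Hypothetical helper: print the value of one property from a Hadoop-style
# XML configuration file. Assumes <value>...</value> sits on a single line
# at or after the matching <name> line.
get_prop() {
  file="$1"
  name="$2"
  awk -v name="$name" '
    # Remember that the requested <name> element has been seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the text of the first single-line <value> at or after it.
    found && match($0, /<value>.*<\/value>/) {
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}

# Usage (commented out; the path is the one edited in this step):
# get_prop /etc/oozie/conf/oozie-site.xml oozie.systemmode
```

The function prints nothing when the property is absent, so an empty result is a quick hint that a name was mistyped.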
5. Configure the Oozie Web Console
$ cd /tmp/
$ wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
$ cd /var/lib/oozie/
$ sudo unzip /tmp/ext-2.2.zip
$ cd ext-2.2/
$ sudo -u hdfs hadoop fs -mkdir /user/oozie
$ sudo -u hdfs hadoop fs -chown oozie:oozie /user/oozie
6. Configure the Oozie ShareLib
$ mkdir /tmp/ooziesharelib
$ cd /tmp/ooziesharelib
$ tar xzf /usr/lib/oozie/oozie-sharelib.tar.gz
$ sudo -u oozie hadoop fs -put share /user/oozie/share
$ sudo -u oozie hadoop fs -ls /user/oozie/share
$ sudo -u oozie hadoop fs -ls /user/oozie/share/lib
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/hbase.jar /user/oozie/share/lib/hive/
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/zookeeper.jar /user/oozie/share/lib/hive/
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.5.0.jar /user/oozie/share/lib/hive/
$ sudo -u oozie hadoop fs -put /usr/lib/hive/lib/guava-11.0.2.jar /user/oozie/share/lib/hive/
$ sudo ln -s /usr/share/java/mysql-connector-java.jar /var/lib/oozie/mysql-connector-java.jar
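The jar file names above embed version numbers (for example hive-hbase-handler-0.10.0-cdh4.5.0.jar), so they may differ on another CDH4 release. As a hedged sketch, the function below matches the versioned jars by pattern and prints the corresponding put commands as a dry run instead of executing them. The name hive_jar_put_cmds and the glob patterns are assumptions made for illustration.

```shell
# Hypothetical helper: print (do not run) one "hadoop fs -put" command for
# each Hive-related jar found in a local directory.
hive_jar_put_cmds() {
  src_dir="$1"   # e.g. /usr/lib/hive/lib
  dest="$2"      # e.g. /user/oozie/share/lib/hive/
  for pattern in hbase.jar zookeeper.jar 'hive-hbase-handler-*.jar' 'guava-*.jar'; do
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for jar in "$src_dir"/$pattern; do
      if [ -f "$jar" ]; then
        echo sudo -u oozie hadoop fs -put "$jar" "$dest"
      else
        # An unmatched pattern stays literal and is reported here.
        echo "missing: $jar" >&2
      fi
    done
  done
}

# Usage (commented out): review the output, then pipe it to sh to execute.
# hive_jar_put_cmds /usr/lib/hive/lib /user/oozie/share/lib/hive/
```

Printing the commands first makes it easy to spot a missing jar before anything is uploaded to HDFS.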
7. Start Oozie
$ sudo service oozie start
8. Access the Oozie Web Console
http://hadoop-secondary:11000/oozie
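The server can take a little while to come up after the service starts, so a small retry loop is useful before opening the console. The sketch below is a generic helper (wait_for is a made-up name) that reruns a command until it succeeds. Using it with the Oozie client's admin -status command assumes that command exits non-zero while the server is unreachable.

```shell
# Hypothetical helper: run a command up to N times, sleeping between
# attempts, and succeed as soon as the command succeeds.
wait_for() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (commented out): poll the server for up to about one minute.
# wait_for 30 2 oozie admin -oozie http://hadoop-secondary:11000/oozie -status
```

Once the loop returns successfully, running the same admin -status command by hand should report the system mode configured in oozie-site.xml.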
9. The Oozie setup is now complete.