I. Environment Setup
Since the cluster needs at least three servers, I reused the MongoDB Master, Slave, and Arbiter environment from last time for the Hadoop cluster. The servers are still provided for free by ibmcloud. The Arbiter here also takes the slave role.
Hostname | IP           | Server Type
Master   | 192.168.0.28 | Centos6.2
Slave    | 192.168.0.29 | Ubuntu14.04
Arbiter  | 192.168.0.30 | Ubuntu14.04
Configure the hosts file on all three machines; the Master's is shown below:
$ cat /etc/hosts
127.0.0.1 localhost Database-Master localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.28 Database-Master master
192.168.0.29 Database-Slave slave
192.168.0.30 Database-Arbiter arbiter
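To sanity-check these entries without touching the real /etc/hosts, the same three cluster lines can be written to a scratch file and looked up with grep (the /tmp path is purely illustrative):

```shell
# Write the cluster entries to a scratch file (not the real /etc/hosts)
cat > /tmp/cluster-hosts <<'EOF'
192.168.0.28 Database-Master master
192.168.0.29 Database-Slave slave
192.168.0.30 Database-Arbiter arbiter
EOF
# Look up the line that defines the short name "master"
grep -w master /tmp/cluster-hosts
```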
Ansible is installed on the Master machine. Download URLs for the other required packages:
http://apache.fayea.com/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.8/zookeeper-3.4.8.tar.gz
http://apache.opencas.org/hbase/1.2.0/hbase-1.2.0-bin.tar.gz
http://download.oracle.com/otn-pub/java/jdk/8u73-b02/jdk-8u73-linux-x64.tar.gz
I extracted the JDK to /usr/java/, then added the environment variables to .zshrc:
export JAVA_HOME=/usr/java/jdk1.8.0_73
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Then reload the file so the variables take effect: source .zshrc.
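A quick sanity check is to re-run the same exports and echo the results (the paths are the ones from this article; adjust them if your JDK version differs):

```shell
# Same exports as in .zshrc (note: the standard JDK file name is tools.jar)
export JAVA_HOME=/usr/java/jdk1.8.0_73
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# Print the resulting values for inspection
echo "$JAVA_HOME"
echo "$CLASSPATH"
```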
The cluster nodes also need passwordless SSH login between them; this was already set up for the earlier MongoDB experiment, so I won't repeat it here.
II. Installing and Configuring Hadoop
1. First extract the hadoop-2.6.4.tar.gz file downloaded above to /home/ibmcloud/hadoop, then edit etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
2. Add the JAVA_HOME variable to hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_73
3. Edit hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/ibmcloud/hadoop/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/ibmcloud/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
4. Rename mapred-site.xml.template to mapred-site.xml and edit it:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>
5. Add the master and slaves files:
echo "master" > ~/hadoop/etc/hadoop/master
echo -e "slave\narbiter" > ~/hadoop/etc/hadoop/slaves
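The two echo commands above can be checked by reading the files back. The sketch below recreates them under a temporary directory (a stand-in for ~/hadoop/etc/hadoop, so the check is side-effect free):

```shell
# Recreate the master/slaves files in a scratch directory
tmpdir=$(mktemp -d)
echo "master" > "$tmpdir/master"
# printf is the portable equivalent of `echo -e "slave\narbiter"`
printf 'slave\narbiter\n' > "$tmpdir/slaves"
# slaves should list one worker hostname per line
cat "$tmpdir/slaves"
```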
6. Copy the hadoop folder to slave and arbiter:
ansible all -m copy -a "src=hadoop dest=~"
7. Start the Hadoop cluster
The namenode must be formatted on the first run; this step is not needed on later startups.
hadoop/bin/hadoop namenode -format
Then start Hadoop:
hadoop/sbin/start-all.sh
Once startup completes without errors, run jps to list the current Java processes. NameNode is the Hadoop master daemon; SecondaryNameNode and ResourceManager are Hadoop daemons as well.
$ jps
23076 NameNode
20788 ResourceManager
23302 SecondaryNameNode
27559 Jps
III. Installing the ZooKeeper Cluster
1. Extract zookeeper-3.4.8.tar.gz and rename the directory to zookeeper. Enter the zookeeper/conf directory, run cp zoo_sample.cfg zoo.cfg, and edit the copy.
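For a three-node ensemble like this one, the edited zoo.cfg typically looks like the sketch below; the ports and timing values are ZooKeeper's common defaults and the dataDir is an assumed path, not values confirmed by this article:

```
tickTime=2000
initLimit=10
syncLimit=5
# assumed data directory; pick any writable path and keep it consistent per node
dataDir=/home/ibmcloud/zookeeper/data
clientPort=2181
# one server.N line per ensemble member, N matching each node's myid file
server.1=master:2888:3888
server.2=slave:2888:3888
server.3=arbiter:2888:3888
```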