For the base system setup, see 《virtualbox5.0.8 centos6.7 mini 安装》.
I. Planning
Three hosts, 3 GB RAM each; IPs: 192.168.56.70, 192.168.56.71, 192.168.56.72
vi /etc/hosts
Keep the default 127.0.0.1 localhost entry (do not map a real IP to localhost), and add the same three lines on every host:
192.168.56.70 hadoop1
192.168.56.71 hadoop2
192.168.56.72 hadoop3
hadoop1: NameNode, SecondaryNameNode, DataNode
hadoop2: DataNode
hadoop3: DataNode
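Once /etc/hosts is in place on every node, a quick sanity check that all three names resolve; a minimal sketch (getent consults the same name database the daemons will use):

```shell
# Check that every cluster hostname resolves locally.
# Run on each node after editing /etc/hosts.
check_hosts() {
  for h in hadoop1 hadoop2 hadoop3; do
    if getent hosts "$h" >/dev/null; then
      echo "$h: resolves"
    else
      echo "$h: NOT in /etc/hosts"
    fi
  done
}
check_hosts
```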
II. Software
jdk-8u11-linux-x64.tar.gz
hadoop-2.7.3.tar.gz
hadoop-2.7.3-src.tar.gz
hbase-1.2.4-bin.tar.gz
spark-2.0.2-bin-hadoop2.7.tgz
zookeeper-3.4.9.tar.gz
Upload all of the above to /usr.
III. Installation
1. JDK
See 《centos6.7 mini 安装oracle jdk 1.8》.
2. Passwordless SSH
vi /etc/ssh/sshd_config
Uncomment the following two lines:
RSAAuthentication yes
PubkeyAuthentication yes
Restart sshd so the change takes effect:
service sshd restart
ssh-keygen -t rsa
(press Enter at every prompt; leave the passphrase empty)
ssh-copy-id -i /root/.ssh/id_rsa.pub root@hadoop1
ssh-copy-id -i /root/.ssh/id_rsa.pub root@hadoop2
ssh-copy-id -i /root/.ssh/id_rsa.pub root@hadoop3
Run the above on every host, so each host's key reaches all the others.
Then ssh from each host to each of the others to verify no password is requested.
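A small sketch to confirm passwordless login from the current host to all three nodes; BatchMode makes ssh fail instead of prompting, so a remaining password prompt shows up as a failure:

```shell
# Try a no-op command on each host with password prompts disabled.
check_ssh() {
  for h in hadoop1 hadoop2 hadoop3; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$h" true 2>/dev/null; then
      echo "$h: passwordless ok"
    else
      echo "$h: NOT passwordless"
    fi
  done
}
check_ssh
```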
3. Hadoop
On every node, create the data directories:
mkdir /home/tmp
mkdir /home/hadoop
mkdir /home/hadoop/namenode
mkdir /home/hadoop/datanode
cd /usr
tar -zxvf hadoop-2.7.3.tar.gz
Set the environment variables:
vi /etc/profile
Append at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_11
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export HADOOP_HOME=/usr/hadoop-2.7.3
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Apply the changes:
source /etc/profile
vi /usr/hadoop-2.7.3/etc/hadoop/slaves
Replace the existing content with:
hadoop2
hadoop3
hadoop1
vi /usr/hadoop-2.7.3/etc/hadoop/core-site.xml
Add inside the <configuration> element:
<property>
<!-- NameNode address -->
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/tmp</value>
</property>
vi /usr/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
Add inside the <configuration> element:
<property>
<!-- where the NameNode stores its metadata -->
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/namenode</value>
</property>
<property>
<!-- where DataNodes store block data -->
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/datanode</value>
</property>
<property>
<!-- replication factor -->
<name>dfs.replication</name>
<value>2</value>
</property>
This file does not exist by default; create it from the template first:
cp /usr/hadoop-2.7.3/etc/hadoop/mapred-site.xml.template /usr/hadoop-2.7.3/etc/hadoop/mapred-site.xml
vi /usr/hadoop-2.7.3/etc/hadoop/mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
vi /usr/hadoop-2.7.3/etc/hadoop/yarn-site.xml
Add inside the <configuration> element:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop1:8032</value>
</property>
<property>
<!-- YARN web UI -->
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop1:8088</value>
</property>
vi /usr/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
Under the "# The java implementation to use." line, set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.8.0_11
vi /usr/hadoop-2.7.3/etc/hadoop/yarn-env.sh
Under the "# some Java parameters" line, set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.8.0_11
Copy Hadoop to the other nodes (and repeat the /etc/profile changes on each):
scp -r /usr/hadoop-2.7.3 root@hadoop2:/usr
scp -r /usr/hadoop-2.7.3 root@hadoop3:/usr
On hadoop1, format the NameNode (first start only):
hdfs namenode -format
cd /usr/hadoop-2.7.3/sbin/
start-dfs.sh
jps
jps on the master node should show these 3 processes:
SecondaryNameNode
NameNode
DataNode
On the slave nodes:
DataNode
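The expected daemon set per node follows the role plan in section I; a small sketch that spells the mapping out, to compare against the actual jps output on each host:

```shell
# Expected HDFS daemons per node (from the role plan in section I).
expected_daemons() {
  case "$1" in
    hadoop1) echo "NameNode SecondaryNameNode DataNode" ;;
    hadoop2|hadoop3) echo "DataNode" ;;
  esac
}
for h in hadoop1 hadoop2 hadoop3; do
  echo "$h: $(expected_daemons "$h")"
done
```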
http://192.168.56.70:50070/dfshealth.html#tab-overview
On the master node, start YARN:
start-yarn.sh
jps on each node should now additionally show:
hadoop1: ResourceManager, NodeManager
hadoop2, hadoop3: NodeManager
http://192.168.56.70:8088/cluster
4. ZooKeeper
cd /usr
tar -zxvf zookeeper-3.4.9.tar.gz
cd zookeeper-3.4.9
mkdir data
cd conf/
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
Change:
dataDir=/usr/zookeeper-3.4.9/data
Append at the end:
server.0=hadoop1:2888:3888
server.1=hadoop2:2888:3888
server.2=hadoop3:2888:3888
cd ../data/
vi myid
0
cd ..
scp -r zookeeper-3.4.9 root@hadoop2:/usr
scp -r zookeeper-3.4.9 root@hadoop3:/usr
On each node, set the myid file to the X of that node's server.X line in zoo.cfg:
hadoop1 gets 0, hadoop2 gets 1, hadoop3 gets 2.
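The mapping can be generated rather than typed by hand; this sketch only prints the command to run on each node (ids assumed from the server.X lines added to zoo.cfg above):

```shell
# Print, per node, the command that writes the matching myid.
myid_commands() {
  i=0
  for h in hadoop1 hadoop2 hadoop3; do
    echo "$h: echo $i > /usr/zookeeper-3.4.9/data/myid"
    i=$((i+1))
  done
}
myid_commands
```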
On every node:
vi /etc/profile
Append at the end:
export ZOOKEEPER_HOME=/usr/zookeeper-3.4.9
export PATH=$PATH:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf
source /etc/profile
On every node, start ZooKeeper:
zkServer.sh start
Check the status:
zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
ZooKeeper JMX enabled by default
Using config: /usr/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
ZooKeeper JMX enabled by default
Using config: /usr/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
Across the three nodes, one should report Mode: leader and the other two Mode: follower.
5. HBase
cd /usr
tar -zxvf hbase-1.2.4-bin.tar.gz
cd hbase-1.2.4/
mkdir logs
cd conf/
vi hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_11
export HBASE_MANAGES_ZK=false
(false, because we use the standalone ZooKeeper from step 4 rather than the one bundled with HBase)
export HBASE_LOG_DIR=/usr/hbase-1.2.4/logs
vi hbase-site.xml
Add inside the <configuration> element:
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop1:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop1,hadoop2,hadoop3</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>180000</value>
<description>Time difference of regionserver from master</description>
</property>
vi regionservers
hadoop1
hadoop2
hadoop3
cd ..
scp -r hbase-1.2.4 root@hadoop2:/usr
scp -r hbase-1.2.4 root@hadoop3:/usr
cd hbase-1.2.4/bin/
./start-hbase.sh
jps
New processes on hadoop1:
HMaster
HRegionServer
New on hadoop2 and hadoop3:
HRegionServer
http://192.168.56.70:16010/master-status
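Once the master UI loads, a quick smoke test is to ask the HBase shell for cluster status; with this layout it should report 3 live servers. A hedged sketch that only prints the command for review (path assumed from this install):

```shell
# Build and show the smoke-test command; run it on hadoop1 with:
#   eval "$HBASE_STATUS_CMD"
HBASE_STATUS_CMD='echo "status" | /usr/hbase-1.2.4/bin/hbase shell'
echo "$HBASE_STATUS_CMD"
```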
6. Spark
cd /usr
tar -zxvf spark-2.0.2-bin-hadoop2.7.tgz
cd spark-2.0.2-bin-hadoop2.7/conf/
cp spark-env.sh.template spark-env.sh
vi spark-env.sh
Append at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_11
export HADOOP_HOME=/usr/hadoop-2.7.3
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HBASE_HOME=/usr/hbase-1.2.4
cp slaves.template slaves
vi slaves
hadoop2
hadoop3
Copy Spark to the worker nodes:
scp -r /usr/spark-2.0.2-bin-hadoop2.7 root@hadoop2:/usr
scp -r /usr/spark-2.0.2-bin-hadoop2.7 root@hadoop3:/usr
cd ../sbin/
Start / stop Spark:
./start-all.sh
./stop-all.sh
After starting:
jps
New processes on hadoop1:
Master
New on hadoop2 and hadoop3:
Worker
http://192.168.56.70:8080/
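A standalone-mode smoke test is to submit the bundled SparkPi example to the master. The master URL (spark://hadoop1:7077) and the examples jar name are assumptions based on this prebuilt package's defaults; the command is printed for review first:

```shell
# Assemble the spark-submit command for the SparkPi example
# (master URL and jar name assumed for spark-2.0.2-bin-hadoop2.7).
SPARK_HOME=/usr/spark-2.0.2-bin-hadoop2.7
SUBMIT_CMD="$SPARK_HOME/bin/spark-submit \
  --master spark://hadoop1:7077 \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.0.2.jar 10"
echo "$SUBMIT_CMD"   # review, then run with: eval "$SUBMIT_CMD"
```

If the cluster is healthy, the driver output ends with a line estimating Pi.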