With the groundwork from the previous three articles in place, we are finally ready to install the Hadoop cluster. Without further ado, let's get started.
1. Installing the JDK
1) Remove the OpenJDK that ships with CentOS 7
#rpm -qa | grep jdk //list the JDK packages already on the system
#yum -y remove java-1.8.0-openjdk-headless-1.8.0.102-4.b14.el7.x86_64
#yum -y remove java-1.8.0-openjdk-1.8.0.102-4.b14.el7.x86_64
#yum -y remove java-1.7.0-openjdk-headless-1.7.0.111-2.6.7.8.el7.x86_64
#yum -y remove java-1.7.0-openjdk-1.7.0.111-2.6.7.8.el7.x86_64
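To confirm the removal worked, you can re-run the query; it should no longer list any java-1.7.0/java-1.8.0-openjdk packages (an unrelated match such as copy-jdks-configs may remain):
#rpm -qa | grep jdk //verify the OpenJDK packages are gone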
2) Install jdk1.8.0_131
#tar -zxvf jdk-8u131-linux-x64.tar.gz -C /usr/local/hadoopenv/java/ //extract jdk-8u131-linux-x64.tar.gz
#ls /usr/local/hadoopenv/java/ //check that the jdk1.8.0_131 directory was created
#rm -rf jdk-8u131-linux-x64.tar.gz //remove the installer archive
3) Add the JDK to the system path by editing the environment variables
#vim /etc/bashrc
Append the following at the end of the file:
export JAVA_HOME=/usr/local/hadoopenv/java/jdk1.8.0_131
export JRE_HOME=/usr/local/hadoopenv/java/jdk1.8.0_131/jre
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
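The new variables only apply to fresh logins; to pick them up in the current shell, reload the file:
#source /etc/bashrc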
4) Check that the JDK is configured correctly
$java -version
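If everything is wired up correctly, the output should look roughly like the following (the exact build number may differ):
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)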
2. Installing Hadoop
1) Extract to the target directory
#tar -zxvf hadoop-2.7.3.tar.gz -C /usr/local/hadoopenv/
2) Rename the directory
#mv /usr/local/hadoopenv/hadoop-2.7.3 /usr/local/hadoopenv/hadoop
3) Change the ownership
#chown -R hadoop:hadoop hadoop/
//give the hadoop user ownership of the "hadoop" directory
4) Delete the hadoop-2.7.3.tar.gz archive
#rm -rf hadoop-2.7.3.tar.gz
5) Edit the configuration files
(a) Create the following directories (the paths must match the ones referenced by the config files below)
$mkdir -p /usr/local/hadoopenv/tmp/hadoop/dfs/name
$mkdir -p /usr/local/hadoopenv/tmp/hadoop/dfs/data
$mkdir -p /usr/local/hadoopenv/tmp/hadoop/dfs/temp
$mkdir -p /usr/local/hadoopenv/tmp/hadoop/dfs/logs
$mkdir -p /usr/local/hadoopenv/tmp/hadoop/dfs/pids
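A quick sanity check that the whole tree exists before wiring it into the configs:
$ls /usr/local/hadoopenv/tmp/hadoop/dfs //should list data, logs, name, pids and temp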
(b) Edit core-site.xml
Add the following:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoopenv/tmp/hadoop/</value>
  <description>A base for other temporary directories.</description>
</property>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode:9000</value>
</property>
(c) Edit hadoop-env.sh
Append at the end of the file (note: no spaces around the =):
export JAVA_HOME=/usr/local/hadoopenv/java/jdk1.8.0_131
export HADOOP_LOG_DIR=/usr/local/hadoopenv/tmp/hadoop/dfs/logs
export HADOOP_PID_DIR=/usr/local/hadoopenv/tmp/hadoop/dfs/pids
(d) Edit hdfs-site.xml
Add the following to the file:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoopenv/tmp/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoopenv/tmp/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>namenode:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
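With the hadoop bin directory on the PATH (the start-dfs.sh step below assumes the same), you can read any key back without starting the cluster, which catches XML typos early:
$hdfs getconf -confKey dfs.replication //should print 3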
(e) Edit mapred-env.sh
#vim mapred-env.sh
Add the following (again with no spaces around the =):
export JAVA_HOME=/usr/local/hadoopenv/java/jdk1.8.0_131
export HADOOP_LOG_DIR=/usr/local/hadoopenv/tmp/hadoop/dfs/logs
export HADOOP_PID_DIR=/usr/local/hadoopenv/tmp/hadoop/dfs/pids
(f) Edit mapred-site.xml (Hadoop 2.7.3 ships only mapred-site.xml.template, so create the file first with #cp mapred-site.xml.template mapred-site.xml)
Add the following:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.job.tracker</name>
  <value>localhost:54311</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>namenode:10020</value>
</property>
<property>
  <name>mapreduce.jobtracker.http.address</name>
  <value>namenode:50030</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>namenode:19888</value>
</property>
(g) Edit slaves
Change the file contents to:
datanode1 //hostname of slave1
datanode2 //hostname of slave2
(h) Edit yarn-env.sh
Append the following at the end of the file (lowercase export, and the log directory matching the one used above):
export JAVA_HOME=/usr/local/hadoopenv/java/jdk1.8.0_131
export YARN_LOG_DIR=/usr/local/hadoopenv/tmp/hadoop/dfs/logs
(i) Edit yarn-site.xml
Add the following:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>namenode</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>namenode:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>namenode:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>namenode:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>namenode:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>namenode:8088</value>
</property>
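The configuration above lives on the master only, but every node needs the same Hadoop tree. Assuming the slaves use the identical /usr/local/hadoopenv layout, it can be copied over the same way the zookeeper directory is copied later:
#scp -r /usr/local/hadoopenv/hadoop/ root@datanode1:/usr/local/hadoopenv/
#scp -r /usr/local/hadoopenv/hadoop/ root@datanode2:/usr/local/hadoopenv/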
6) Format the file system
hdfs namenode -format
7) Start HDFS and YARN
start-dfs.sh
start-yarn.sh
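If the cluster came up cleanly, jps on each machine should show roughly the following daemons (PIDs will differ):
namenode: NameNode, SecondaryNameNode, ResourceManager
datanode1/datanode2: DataNode, NodeManager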
3. Installing ZooKeeper
1) Extract, rename, and delete the archive
tar -zxvf /home/hadoop/zookeeper-3.4.8.tar.gz -C /usr/local/hadoopenv/ //extract
mv /usr/local/hadoopenv/zookeeper-3.4.8 /usr/local/hadoopenv/zookeeper //rename
rm -rf /home/hadoop/zookeeper-3.4.8.tar.gz //delete the archive
2) Edit zoo.cfg
#cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg
#vim zoo.cfg
Set dataDir (replacing the sample's default dataDir=/tmp/zookeeper) and append the server list; the N in each server.N must match the myid written on that node in the next step:
dataDir=/usr/local/hadoopenv/tmp/zookeeper/data
server.1=namenode:2888:3888
server.2=datanode1:2888:3888
server.3=datanode2:2888:3888
3) Create the myid file
mkdir -p /usr/local/hadoopenv/tmp/zookeeper/data
cd /usr/local/hadoopenv/tmp/zookeeper/data
touch myid
echo 1 > myid
4) Copy the configured zookeeper directory, in full, to the other machines
scp -r /usr/local/hadoopenv/zookeeper/ root@datanode1:/usr/local/hadoopenv/
scp -r /usr/local/hadoopenv/tmp/zookeeper/data/ root@datanode1:/usr/local/hadoopenv/tmp/zookeeper/
scp -r /usr/local/hadoopenv/zookeeper/ root@datanode2:/usr/local/hadoopenv/
scp -r /usr/local/hadoopenv/tmp/zookeeper/data/ root@datanode2:/usr/local/hadoopenv/tmp/zookeeper/
5) Change myid on the other nodes (it must match the server.N entries in zoo.cfg)
[root@datanode1 data]#echo 2 > myid
[root@datanode2 data]#echo 3 > myid
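A quick check that every node carries the id the quorum expects:
#cat /usr/local/hadoopenv/tmp/zookeeper/data/myid //1 on namenode, 2 on datanode1, 3 on datanode2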
4. Installing HBase
1) Extract, rename, and delete the archive
# tar -zxvf hbase-1.2.6-bin.tar.gz -C /usr/local/hadoopenv/
#mv /usr/local/hadoopenv/hbase-1.2.6 /usr/local/hadoopenv/hbase //rename
#rm -rf hbase-1.2.6-bin.tar.gz
2) Edit hbase-env.sh
#vim hbase-env.sh
Append the following at the end of the file:
export JAVA_HOME=/usr/local/hadoopenv/java/jdk1.8.0_131/
export HBASE_CLASSPATH=/usr/local/hadoopenv/hbase
# Tell HBase whether it should manage its own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
3) Edit hbase-site.xml
#vim hbase-site.xml
Add the following to the file:
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://namenode:9000/hbase</value>
</property>
<property>
  <name>hbase.master</name>
  <value>namenode:60000</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/usr/local/hadoopenv/tmp/zookeeper/data</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>namenode,datanode1,datanode2</value>
</property>
<property>
  <name>hbase.master.maxclockskew</name>
  <value>150000</value>
</property>
<property>
  <name>zookeeper.session.ticktime</name>
  <value>6000</value>
</property>
4) Edit regionservers
Remove the localhost line and add the following instead:
datanode1
datanode2
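Like zookeeper, the configured hbase directory has to exist on every node; assuming the same layout on the slaves, it can be distributed the same way:
#scp -r /usr/local/hadoopenv/hbase/ root@datanode1:/usr/local/hadoopenv/
#scp -r /usr/local/hadoopenv/hbase/ root@datanode2:/usr/local/hadoopenv/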
5. Starting the cluster
1) Start Hadoop
start-dfs.sh
start-yarn.sh
2) Start ZooKeeper (run this on namenode, datanode1 and datanode2)
zkServer.sh start
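Once all three nodes have started, each can report its role in the quorum; one should answer leader and the other two follower:
zkServer.sh status //prints Mode: leader on one node, Mode: follower on the others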
3) Start HBase
start-hbase.sh
4) Check the running processes with jps
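With all three layers up, jps should list roughly the following (ordering and PIDs vary):
namenode: NameNode, SecondaryNameNode, ResourceManager, QuorumPeerMain, HMaster
datanode1/datanode2: DataNode, NodeManager, QuorumPeerMain, HRegionServer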
5) Check via the web UIs
Open the following addresses in a browser:
namenode:16010 //HBase Master web UI
namenode:50070 //HDFS NameNode web UI
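As a final smoke test, the cluster can also be queried from the command line:
$hdfs dfsadmin -report //lists the live DataNodes
$echo "status" | hbase shell //reports the active master and region servers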