Hadoop Fully Distributed Cluster Setup Steps

The steps below apply to releases before CentOS 7, e.g. CentOS 6.4 and CentOS 6.5.

1. Prepare several machines:

   Three machines are used in this test:

   IP                 Hostname
   192.168.140.131    hadoop-chenxiang01
   192.168.140.132    hadoop-chenxiang02
   192.168.140.133    hadoop-chenxiang03

 

2. Change the hostname (on all 3 machines):

         hostname <new-hostname>            (temporary, lost on reboot)

         vi /etc/sysconfig/network          (permanent)
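
   For example, a minimal sketch for the first node (on CentOS 6 the HOSTNAME line in /etc/sysconfig/network is what survives a reboot):

         hostname hadoop-chenxiang01                  # temporary, takes effect immediately

         # /etc/sysconfig/network
         NETWORKING=yes
         HOSTNAME=hadoop-chenxiang01                  # permanent after reboot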

3. Map hostnames to IP addresses (on all 3 machines):

         vi /etc/hosts

                   192.168.140.131    hadoop-chenxiang01    localhost01
                   192.168.140.132    hadoop-chenxiang02    localhost02
                   192.168.140.133    hadoop-chenxiang03    localhost03

   Hostnames are used only for convenience; you can skip this and use the IP addresses directly.
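
   A quick way to confirm the mapping on any node is to ping one of the other hostnames (host chosen for illustration):

         ping -c 1 hadoop-chenxiang02       # should resolve to 192.168.140.132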

4. Create the directory /opt/app under /opt for the cluster (on all 3 machines):

                   mkdir /opt/app

         Grant ownership of that directory to the working user:

                   chown -R chenxiang:chenxiang /opt/app
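
   If the chenxiang account does not exist yet, a sketch of the whole step (the user name is taken from the chown command above):

         useradd chenxiang                             # skip if the user already exists
         mkdir /opt/app
         chown -R chenxiang:chenxiang /opt/app
         ls -ld /opt/app                               # verify owner and group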

5. Install the JDK (JDK 7 is used in this test; install on all three machines)

         1) Most Linux distributions ship with an open-source JDK; uninstall it first:

                   * List the installed Java packages:  rpm -qa | grep java

                   * Uninstall them:  rpm -e --nodeps <package names from the previous step>

         2) Extract the JDK to /opt/modules and configure the environment variables:  vi /etc/profile

                   * Append the following at the end of the file:

                            export JAVA_HOME=/opt/modules/jdk      (the JDK install path)

                            export PATH=$PATH:$JAVA_HOME/bin

                   * Apply the change:  source /etc/profile

                   * Verify the installation:  java -version
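
         Putting step 5 together on one node, a sketch assuming a hypothetical JDK 7 tarball (jdk-7u80-linux-x64.tar.gz) whose extracted directory is renamed to match the JAVA_HOME above:

                   rpm -qa | grep java                                   # list the bundled OpenJDK packages
                   rpm -e --nodeps <each package printed above>
                   mkdir -p /opt/modules
                   tar -xzvf jdk-7u80-linux-x64.tar.gz -C /opt/modules   # hypothetical tarball name
                   mv /opt/modules/jdk1.7.0_80 /opt/modules/jdk          # so that JAVA_HOME=/opt/modules/jdk is valid
                   source /etc/profile
                   java -version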

6. Install Hadoop

         Extract hadoop-2.5.0.tar.gz to /opt/app:

                   tar -xzvf hadoop-2.5.0.tar.gz -C /opt/app

7. Plan the machines and services

                      192.168.140.131      192.168.140.132      192.168.140.133

   HDFS               NameNode                                  SecondaryNameNode
                      DataNode             DataNode             DataNode

   YARN                                    ResourceManager
                      NodeManager          NodeManager          NodeManager

   MapReduce          JobHistoryServer


8. Configuration

         1) HDFS

                   * hadoop-env.sh

                            Set the JDK:  export JAVA_HOME=/opt/modules/jdk

                   * core-site.xml

                            (1) Point the file system at the host running the NameNode:

                                <property>
                                    <name>fs.defaultFS</name>
                                    <value>hdfs://hadoop-chenxiang01:8020</value>
                                </property>

                            (2) Change Hadoop's default temporary-file directory (first create data/tmp under the Hadoop install directory, as shown in the sketch after this list):

                                <property>
                                    <name>hadoop.tmp.dir</name>
                                    <value>/opt/app/hadoop-2.5.0/data/tmp</value>
                                </property>

                            (3) Enable the trash mechanism (minutes before deleted files are purged):

                                <property>
                                    <name>fs.trash.interval</name>
                                    <value>420</value>
                                </property>
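
                            As noted in (2), create that temporary directory before formatting, on each node:

                                mkdir -p /opt/app/hadoop-2.5.0/data/tmp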

                   * hdfs-site.xml

                            (1) Specify the host running the SecondaryNameNode:

                                <property>
                                    <name>dfs.namenode.secondary.http-address</name>
                                    <value>hadoop-chenxiang03:50090</value>
                                </property>

                   * slaves

                            (1) List the hosts running DataNodes (all three machines here):

                                hadoop-chenxiang01
                                hadoop-chenxiang02
                                hadoop-chenxiang03

         2) YARN

                   * yarn-env.sh

                            Set the JDK:  export JAVA_HOME=/opt/modules/jdk

                   * yarn-site.xml

                            (1) Specify the host running the ResourceManager:

                                <property>
                                    <name>yarn.resourcemanager.hostname</name>
                                    <value>hadoop-chenxiang02</value>
                                </property>

                            (2) Enable the MapReduce shuffle auxiliary service:

                                <property>
                                    <name>yarn.nodemanager.aux-services</name>
                                    <value>mapreduce_shuffle</value>
                                </property>

                            (3) Memory (MB) available to each NodeManager for containers:

                                <property>
                                    <name>yarn.nodemanager.resource.memory-mb</name>
                                    <value>4096</value>
                                </property>

                            (4) Virtual CPU cores available to each NodeManager:

                                <property>
                                    <name>yarn.nodemanager.resource.cpu-vcores</name>
                                    <value>4</value>
                                </property>

                            (5) Enable log aggregation:

                                <property>
                                    <name>yarn.log-aggregation-enable</name>
                                    <value>true</value>
                                </property>

                            (6) How long (in seconds) to keep aggregated logs:

                                <property>
                                    <name>yarn.log-aggregation.retain-seconds</name>
                                    <value>64088</value>
                                </property>

                   * slaves

                            (1) List the hosts running NodeManagers (the same slaves file as for the DataNodes; all three machines here):

                                hadoop-chenxiang01
                                hadoop-chenxiang02
                                hadoop-chenxiang03

         3) MapReduce

                   * mapred-env.sh

                            Set the JDK:  export JAVA_HOME=/opt/modules/jdk

                   * mapred-site.xml (see the note after this list if the file does not exist yet)

                            (1) Run MapReduce on YARN:

                                <property>
                                    <name>mapreduce.framework.name</name>
                                    <value>yarn</value>
                                </property>

                            (2) Host and port of the JobHistoryServer IPC address:

                                <property>
                                    <name>mapreduce.jobhistory.address</name>
                                    <value>hadoop-chenxiang01:10020</value>
                                </property>

                            (3) Host and port of the JobHistoryServer web UI:

                                <property>
                                    <name>mapreduce.jobhistory.webapp.address</name>
                                    <value>hadoop-chenxiang01:19888</value>
                                </property>
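
                   Note: the Hadoop 2.x tarball normally ships only a template for this file under etc/hadoop. If mapred-site.xml does not exist yet, create it from the template before editing (paths assume the install location above):

                            cd /opt/app/hadoop-2.5.0/etc/hadoop
                            cp mapred-site.xml.template mapred-site.xml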

9. Distribute the Hadoop installation to every node

         1) Set up passwordless SSH login

                   According to the machine/service plan above, .131 and .132 must be able to log in to the other machines without a password.

                   (1) Generate the key pair

                            * In the .ssh directory under the user's home directory, generate the public and private keys:

                              [root@localhost .ssh]# ssh-keygen -t rsa      (press Enter 4 times; run on 131 and 132)

                            * Copy the public key to every host that needs to be reachable:

                              [root@localhost .ssh]# ssh-copy-id <hostname>      (all 3 hostnames, including the local machine)

                   Both steps above are performed on 131 and 132.

                            Test that a remote host can be reached without a password:  ssh <hostname>

         2) Distribute Hadoop to each node

                   Go to the directory where Hadoop was extracted on 131 (/opt/app) and copy the whole Hadoop directory to the same location on the other two machines:  scp -r ./hadoop-2.5.0/ <user>@<hostname>:/opt/app

Note: before distributing, go into the Hadoop directory and delete the doc directory under share together with everything in it. It only contains documentation and is not needed; removing it makes the copy much faster (see the sketch below).
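
A sketch of the whole distribution step, run on 131 with the hostnames above and the chenxiang account from step 4:

         cd /opt/app/hadoop-2.5.0
         rm -rf share/doc                                               # documentation only; safe to delete
         cd /opt/app
         scp -r ./hadoop-2.5.0/ chenxiang@hadoop-chenxiang02:/opt/app
         scp -r ./hadoop-2.5.0/ chenxiang@hadoop-chenxiang03:/opt/app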

Now the whole cluster environment is in place and can be started and tested:

[root@localhost hadoop-2.5.0]# bin/hdfs namenode -format      format the NameNode (run once, on 131)

[root@localhost hadoop-2.5.0]# sbin/start-dfs.sh              start NameNode, DataNodes and SecondaryNameNode (run on 131)

[root@localhost hadoop-2.5.0]# sbin/start-yarn.sh             start ResourceManager and NodeManagers (run on 132, since start-yarn.sh starts the ResourceManager on the machine it is invoked on)

[root@localhost hadoop-2.5.0]# jps                            list the running Java processes on each node
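
The JobHistoryServer planned for 131 is not covered by the two start scripts above. A sketch of starting it, plus the stock Hadoop 2.5 web UI addresses for a quick check (default ports, assuming none were overridden):

[root@localhost hadoop-2.5.0]# sbin/mr-jobhistory-daemon.sh start historyserver      (run on 131)

         http://hadoop-chenxiang01:50070      HDFS NameNode web UI
         http://hadoop-chenxiang02:8088       YARN ResourceManager web UI
         http://hadoop-chenxiang01:19888      MapReduce JobHistoryServer web UI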
