Hadoop cluster setup steps

1. Cluster setup prerequisites (host configuration)

Prepare at least two machines; this guide uses three:

ip                 hostname

172.19.0.1         hserver1
172.18.2.32        hserver2
192.43.2.31        hserver3

a). Optionally, change each machine's hostname. This step is not required, but it makes the machines easier to identify:

        hostnamectl set-hostname hserver1    (on 172.19.0.1)
        hostnamectl set-hostname hserver2    (on 172.18.2.32)
        hostnamectl set-hostname hserver3    (on 192.43.2.31)

b). On every machine you prepared, edit the hosts file:

[root@hserver1 ~]# vim /etc/hosts

Add the following entries:

172.19.0.1      hserver1
172.18.2.32     hserver2
192.43.2.31     hserver3
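One way to script the edit above is to stage the mappings in a file first and review them before touching /etc/hosts. A minimal sketch, using the IPs and hostnames from this guide:

```shell
# Stage the cluster name mappings in a local file.
cat > hosts.cluster <<'EOF'
172.19.0.1      hserver1
172.18.2.32     hserver2
192.43.2.31     hserver3
EOF
# Then, on each node as root:  cat hosts.cluster >> /etc/hosts
```

Staging the lines lets you diff and reuse the same file on every node instead of retyping the entries.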

2. Create the Hadoop user

For example, hadoop (run on every machine):

[root@hserver1 ~]# useradd -m hadoop    // create the hadoop user

[root@hserver1 ~]# echo 123456 | passwd --stdin hadoop    // set the hadoop user's password to 123456

3. Passwordless SSH setup

Run on every machine. For example, on the first machine, hserver1:

[root@hserver1 ~]# su hadoop    // switch to the hadoop user

[hadoop@hserver1 ~]$ ssh-keygen -t rsa    // generate a key pair

[hadoop@hserver1 ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@hserver1    // copy the public key to each machine, including this one

[hadoop@hserver1 ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@hserver2

[hadoop@hserver1 ~]$ ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@hserver3

        Verify: ssh hadoop@hserver2    // should log in without a password prompt

Repeat the steps above on the other machines.
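Running ssh-copy-id by hand on every host is error-prone. One hedged way to automate it is to generate the command list first and review it before executing (host names and key path from this guide; copy_keys.sh is a helper name chosen here):

```shell
# Generate a helper script that pushes the hadoop public key to every node.
# Review copy_keys.sh, then run it as the hadoop user:  sh copy_keys.sh
HOSTS="hserver1 hserver2 hserver3"
for h in $HOSTS; do
    echo "ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@$h"
done > copy_keys.sh
```

Generating the script rather than running the loop directly makes it easy to spot a typo in a hostname before any keys are copied.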

4. Move the Hadoop and Java packages into the hadoop user's home directory

         My Hadoop package: hadoop-2.9.2

         My Java package: jdk1.8.0_171

         Fix the ownership, because the extracted packages are most likely not owned by the hadoop user:

    [root@hserver1 ~]# chown -R hadoop:hadoop /home/hadoop/hadoop-2.9.2

    [root@hserver1 ~]# chown -R hadoop:hadoop /home/hadoop/jdk1.8.0_171

5. Create Hadoop's data directories (all following steps run as the hadoop user, not as root)

        mkdir -p /home/hadoop/hadoop
        mkdir -p /home/hadoop/hadoop/tmp
        mkdir -p /home/hadoop/hadoop/var
        mkdir -p /home/hadoop/hadoop/dfs
        mkdir -p /home/hadoop/hadoop/dfs/name
        mkdir -p /home/hadoop/hadoop/dfs/data
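Since mkdir -p creates parent directories automatically, the six commands above collapse to a single line. BASE matches the guide's /home/hadoop/hadoop when run as the hadoop user:

```shell
# One-line equivalent: only the leaf directories need to be listed;
# mkdir -p creates /home/hadoop/hadoop and .../dfs along the way.
BASE="$HOME/hadoop"
mkdir -p "$BASE/tmp" "$BASE/var" "$BASE/dfs/name" "$BASE/dfs/data"
```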

6. Edit the Hadoop configuration files

 a).core-site.xml

[hadoop@hserver1 hadoop-2.9.2]$ vim etc/hadoop/core-site.xml

 <configuration>
        <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
   </property> 
   <property>
        <!-- fs.default.name is the deprecated predecessor of fs.defaultFS; 2.9.2 accepts both -->
        <name>fs.default.name</name>
        <value>hdfs://hserver1:8888</value>
   </property>
 </configuration>

b).hdfs-site.xml 
 

[hadoop@hserver1 hadoop-2.9.2]$ vim etc/hadoop/hdfs-site.xml

<configuration>
        <property>
           <name>dfs.name.dir</name>
           <value>/home/hadoop/hadoop/dfs/name</value>
        </property>
        <property>
           <name>dfs.data.dir</name>
           <value>/home/hadoop/hadoop/dfs/data</value>
           <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
        </property>
        <property>
           <name>dfs.replication</name>
           <value>2</value>
        </property>
        <property>
              <name>dfs.permissions</name>
              <value>false</value>
              <description>Disable HDFS permission checking.</description>
        </property>
        <property>
            <name>dfs.namenode.http-address</name>
            <value>hserver1:8118</value>
        </property>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hserver3:8119</value>
        </property>
</configuration>

c). mapred-site.xml

[hadoop@hserver1 hadoop-2.9.2]$ vim etc/hadoop/mapred-site.xml

<configuration>


          <property>
              <!-- legacy MRv1 setting; ignored when mapreduce.framework.name is yarn -->
              <name>mapred.job.tracker</name>
              <value>hserver1:49001</value>
          </property>

          <property>
              <name>mapred.local.dir</name>
              <value>/home/hadoop/hadoop/var</value>
          </property>

          <property>
              <name>mapreduce.framework.name</name>
              <value>yarn</value>
          </property>

</configuration>

d). slaves

[hadoop@hserver1 hadoop-2.9.2]$ vim etc/hadoop/slaves

hserver1

hserver2

hserver3
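The slaves file can also be written non-interactively, which helps when scripting the whole setup. In this sketch HADOOP_DIR is a placeholder: point it at the real etc/hadoop directory (e.g. /home/hadoop/hadoop-2.9.2/etc/hadoop):

```shell
# Write the slaves file without opening an editor. "." is only an
# illustrative default; set HADOOP_DIR to the actual etc/hadoop path.
HADOOP_DIR="${HADOOP_DIR:-.}"
cat > "$HADOOP_DIR/slaves" <<'EOF'
hserver1
hserver2
hserver3
EOF
```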

e). yarn-site.xml

[hadoop@hserver1 hadoop-2.9.2]$ vim etc/hadoop/yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
          <name>yarn.resourcemanager.hostname</name>
          <value>hserver1</value>
    </property>
    <property>
          <description>The address of the applications manager interface in the RM.</description>
          <name>yarn.resourcemanager.address</name>
          <value>${yarn.resourcemanager.hostname}:8032</value>
    </property>
    <property>
          <description>The address of the scheduler interface.</description>
          <name>yarn.resourcemanager.scheduler.address</name>
          <value>${yarn.resourcemanager.hostname}:8030</value>
    </property>
    <property>
          <description>The http address of the RM web application.</description>
          <name>yarn.resourcemanager.webapp.address</name>
          <value>${yarn.resourcemanager.hostname}:8088</value>
    </property>
    <property>
          <description>The https address of the RM web application.</description>
          <name>yarn.resourcemanager.webapp.https.address</name>
          <value>${yarn.resourcemanager.hostname}:8090</value>
    </property>
    <property>
          <name>yarn.resourcemanager.resource-tracker.address</name>
          <value>${yarn.resourcemanager.hostname}:8031</value>
    </property>
    <property>
          <description>The address of the RM admin interface.</description>
          <name>yarn.resourcemanager.admin.address</name>
          <value>${yarn.resourcemanager.hostname}:8033</value>
    </property>
    <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
    </property>
    <property>
          <name>yarn.scheduler.maximum-allocation-mb</name>
          <value>1024</value>
          <description>Default: 8192 MB</description>
    </property>
    <property>
          <name>yarn.nodemanager.vmem-pmem-ratio</name>
          <value>2.1</value>
    </property>
    <property>
          <name>yarn.nodemanager.resource.memory-mb</name>
          <value>1024</value>
   </property>
   <property>
         <name>yarn.nodemanager.vmem-check-enabled</name>
         <value>false</value>
   </property>
</configuration>
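The guide assumes every node carries the same configuration, so the files edited above need to reach hserver2 and hserver3 as well. As a hedged sketch (paths and hosts from this guide; sync_conf.sh is a helper name chosen here), generate the copy commands first and review them:

```shell
# Generate scp commands that copy the edited etc/hadoop directory to the
# worker nodes. Review sync_conf.sh, then run it as the hadoop user.
WORKERS="hserver2 hserver3"
for h in $WORKERS; do
    echo "scp -r /home/hadoop/hadoop-2.9.2/etc/hadoop hadoop@$h:/home/hadoop/hadoop-2.9.2/etc/"
done > sync_conf.sh
```

This relies on the passwordless SSH set up in step 3, so the copies run without password prompts.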

7. Add Hadoop and Java to the environment variables

[hadoop@hserver1 hadoop-2.9.2]$ vim ~/.bashrc
export JDK_ROOT=/home/hadoop/jdk1.8.0_171
export J2SDKDIR=${JDK_ROOT}
export J2REDIR=${JDK_ROOT}/jre
export JAVA_HOME=${JDK_ROOT}
export DERBY_HOME=${JDK_ROOT}/db

export HADOOP_HOME=/home/hadoop/hadoop-2.9.2
export PATH=${JDK_ROOT}/bin:${JDK_ROOT}/jre/bin:${JDK_ROOT}/db/bin:$PATH
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
#export MANPATH=${JDK_ROOT}/opt/sun-java8/man:$MANPATH


PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

To make the new variables take effect, run source ~/.bashrc (make sure you are the hadoop user).

8. Format the Hadoop NameNode

[hadoop@hserver1 hadoop-2.9.2]$ hadoop namenode -format    // deprecated; the current form is "hdfs namenode -format"
....
19/01/18 10:46:35 INFO namenode.FSImage: Allocated new BlockPoolId: BP-2072119921-10.58.107.38-1547779595732
19/01/18 10:46:35 INFO common.Storage: Storage directory /home/work/hadoop/dfs/name has been successfully formatted.
19/01/18 10:46:35 INFO namenode.FSImageFormatProtobuf: Saving image file /home/work/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
19/01/18 10:46:35 INFO namenode.FSImageFormatProtobuf: Image file /home/work/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds .
19/01/18 10:46:35 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
19/01/18 10:46:35 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at cq02-bda-advsvr06-02.cq02.baidu.com/10.58.107.38
************************************************************/
....

9. Start the Hadoop cluster

[hadoop@hserver1 hadoop-2.9.2]$ start-all.sh    // deprecated; equivalently run start-dfs.sh followed by start-yarn.sh

10. Verify from the terminal

[hadoop@hserver1 ~]$ jps

120432 NodeManager
119714 DataNode
122194 Jps
119593 NameNode
120299 ResourceManager

Check each machine against its expected role. For example, NameNode runs only on hserver1, while DataNode runs on every machine here, because all three hosts were added to the slaves file.

11. Web UI verification

Open the web UIs configured above in a browser: the HDFS NameNode UI at http://hserver1:8118 (dfs.namenode.http-address) and the YARN ResourceManager UI at http://hserver1:8088 (yarn.resourcemanager.webapp.address).
