Hadoop Cluster Setup

1. Hadoop Cluster Planning

(Figure: cluster plan. node1 runs the NameNode, SecondaryNameNode, and ResourceManager; all three nodes run a DataNode and a NodeManager.)

Configure the IP-to-hostname mapping on each node:

On node1, edit /etc/hosts:

10.0.194.30     node1
10.0.195.109    node2
10.0.194.59     node3
10.0.194.30    localhost

On node2, edit /etc/hosts:

10.0.194.30     node1
10.0.195.109    node2
10.0.194.59     node3
10.0.195.109    localhost

On node3, edit /etc/hosts:

10.0.194.30     node1
10.0.195.109    node2
10.0.194.59     node3
10.0.194.59    localhost
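
To confirm the mapping took effect, a quick resolution check (hostnames as configured above; getent reads /etc/hosts):

    # Each lookup should print the address configured above
    getent hosts node1 node2 node3

    # The peers should also be reachable by name
    ping -c 1 node2
    ping -c 1 node3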

Baidu Pan link for the Hadoop CDH, Hive, and JDK tarballs:

Link: https://pan.baidu.com/s/1zm6ur2-aq4hSNVwuDqsXbA
Extraction code: 1234

2. Prerequisites

Configure passwordless SSH login between the master node and the worker nodes

  1. Generate an SSH key pair on each of node1, node2, and node3: ssh-keygen -t rsa

  2. Since node1 is the master and must be able to log in to node1, node2, and node3 without a password, copy node1's public key to all three nodes (run on node1):

    ssh-copy-id -i ~/.ssh/id_rsa.pub node1
    ssh-copy-id -i ~/.ssh/id_rsa.pub node2
    ssh-copy-id -i ~/.ssh/id_rsa.pub node3
    

    Note: if the IP mapping above has not been configured, the node hostnames cannot be resolved here. The first copy prompts for a password; subsequent logins are passwordless. A login check follows below.
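
A minimal login check from node1 (hostnames as above); each command should print the remote hostname without prompting for a password:

    for host in node1 node2 node3; do
        ssh "$host" hostname
    done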

3. JDK Installation

  1. Install JDK 1.8 on node1:

    • Extract jdk1.8.tar.gz to a directory of your choice (here, /usr/local)

    • Configure the JDK environment variables; on Ubuntu, edit ~/.bashrc:

      export JAVA_HOME=/usr/local/jdk1.8.0_291
      export JRE_HOME=${JAVA_HOME}/jre  
      export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  
      export PATH=${JAVA_HOME}/bin:$PATH
      

      source ~/.bashrc

      sudo update-alternatives --install /usr/bin/java java /usr/local/jdk1.8.0_291/bin/java 3000
      
      sudo update-alternatives --install /usr/bin/javac javac /usr/local/jdk1.8.0_291/bin/javac 3000
      
      sudo update-alternatives --install /usr/bin/jar jar /usr/local/jdk1.8.0_291/bin/jar 3000
      
      sudo update-alternatives --install /usr/bin/javah javah /usr/local/jdk1.8.0_291/bin/javah 3000
      
      sudo update-alternatives --install /usr/bin/javap javap /usr/local/jdk1.8.0_291/bin/javap 3000
      
      sudo update-alternatives --install /usr/bin/jconsole jconsole /usr/local/jdk1.8.0_291/bin/jconsole 3000
      
  2. Copy the extracted JDK directory from node1 to node2 and node3 with scp, then configure their environment variables:

    • On node1, run scp -r jdk1.8.0_291 root@node2:/usr/local/ and scp -r jdk1.8.0_291 root@node3:/usr/local/

    • On node2 and node3, configure the JDK environment variables exactly as on node1: append the same four export lines to ~/.bashrc, run source ~/.bashrc, and register the same update-alternatives entries. A sanity-check sketch follows this list.

4. Hadoop Cluster Deployment

Configuring the master node (node1)

  1. Extract the hadoop-cdh tarball to /usr/local and add the environment variables:

    export HADOOP_HOME=/usr/local/hadoop-2.6.0-cdh5.15.1
    export PATH=$HADOOP_HOME/bin:$PATH
    

    source ~/.bashrc

  2. Edit the Hadoop configuration files under /usr/local/hadoop-2.6.0-cdh5.15.1/etc/hadoop:

    • Configure hadoop-env.sh:

      export JAVA_HOME=/usr/local/jdk1.8.0_291
      
    • Configure core-site.xml:

      <configuration>
          <property>
              <name>fs.defaultFS</name>
              <value>hdfs://node1:8020</value>
          </property>
      </configuration>
      
    • Configure hdfs-site.xml:

      <configuration>
          <property>
            <name>dfs.namenode.name.dir</name>
            <value>/home/hadoop/app/tmp/dfs/name</value>
          </property>
      
          <property>
            <name>dfs.datanode.data.dir</name>
            <value>/home/hadoop/app/tmp/dfs/data</value>
          </property>
      </configuration>
      
    • Configure yarn-site.xml:

      <configuration>
          <property>
              <name>yarn.nodemanager.aux-services</name>
              <value>mapreduce_shuffle</value>
          </property>

          <property>
              <name>yarn.resourcemanager.hostname</name>
              <value>node1</value>
          </property>
      </configuration>
      
    • Configure mapred-site.xml. The config directory ships only mapred-site.xml.template; rename (or copy) it to mapred-site.xml first:

      <configuration>
          <property>
              <name>mapreduce.framework.name</name>
              <value>yarn</value>
          </property>
      </configuration>
      
    • Configure slaves (one worker hostname per line); a config check follows this list:

      node1
      node2
      node3
      
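Once these files are in place, you can confirm that Hadoop picks them up with the stock getconf tool (it only reads the config files, so the daemons need not be running):

    # Should print hdfs://node1:8020, matching core-site.xml
    hdfs getconf -confKey fs.defaultFS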

Distributing Hadoop from the master node to the other machines

  1. Copy the extracted Hadoop directory to the same path on node2 and node3:

    scp -r /usr/local/hadoop-2.6.0-cdh5.15.1 root@node2:/usr/local/
    scp -r /usr/local/hadoop-2.6.0-cdh5.15.1 root@node3:/usr/local/
    
  2. Configure the Hadoop environment variables on node2 and node3:

    export HADOOP_HOME=/usr/local/hadoop-2.6.0-cdh5.15.1
    export PATH=$HADOOP_HOME/bin:$PATH
    
    source ~/.bashrc
    

Note: if you set Hadoop up by extracting the tarball on each node instead of distributing it with scp, the configuration must match node1's as described above.

In particular, core-site.xml on the worker nodes must point at the master's URI; otherwise the DataNodes will fail to start with a connection error, because the NameNode runs on the master.

Likewise, yarn.resourcemanager.hostname in the workers' yarn-site.xml must be set to node1. A spot-check sketch follows.
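
One way to spot-check this from node1, a minimal sketch assuming the paths used above:

    # Each worker's core-site.xml should name the master's NameNode URI
    for host in node2 node3; do
        ssh "$host" "grep -A1 fs.defaultFS /usr/local/hadoop-2.6.0-cdh5.15.1/etc/hadoop/core-site.xml"
    done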

Formatting the NameNode

  • On the master node node1, run hadoop namenode -format (hdfs namenode -format is the non-deprecated equivalent).

  • If the NameNode has been formatted before, first delete the current folders under /home/hadoop/app/tmp/dfs/name, /home/hadoop/app/tmp/dfs/namesecondary, and /home/hadoop/app/tmp/dfs/data.

    These paths follow from the dfs.namenode.name.dir and dfs.datanode.data.dir settings in hdfs-site.xml. Reformatting without deleting current can cause nodes to fail to start, typically because the cluster ID stored on the DataNodes no longer matches the new NameNode's; a cleanup sketch follows this list.

  • The following line at the end of the output indicates that the format succeeded:

    21/06/30 00:45:54 INFO common.Storage: Storage directory /home/hadoop/app/tmp/dfs/name has been successfully formatted.
    
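A minimal cleanup sketch before reformatting (run on every node that has these directories; paths as configured in hdfs-site.xml above):

    # Remove stale metadata so the regenerated cluster ID is consistent everywhere
    rm -rf /home/hadoop/app/tmp/dfs/name/current
    rm -rf /home/hadoop/app/tmp/dfs/namesecondary/current
    rm -rf /home/hadoop/app/tmp/dfs/data/current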

Starting HDFS

  • On the master node node1, run /usr/local/hadoop-2.6.0-cdh5.15.1/sbin/start-dfs.sh:

    root@node1:/usr/local/hadoop-2.6.0-cdh5.15.1/sbin# /usr/local/hadoop-2.6.0-cdh5.15.1/sbin/start-dfs.sh 
    21/06/30 00:48:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [node1]
    node1: starting namenode, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/hadoop-root-namenode-node1.out
    node1: starting datanode, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/hadoop-root-datanode-node1.out
    node2: starting datanode, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/hadoop-root-datanode-node2.out
    node3: starting datanode, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/hadoop-root-datanode-node3.out
    Starting secondary namenodes [0.0.0.0]
    0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/hadoop-root-secondarynamenode-node1.out
    21/06/30 00:48:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    
  • Check that the daemons on node1 started:

    root@node1:/usr/local/hadoop-2.6.0-cdh5.15.1/sbin# jps
    1041 SecondaryNameNode
    593 NameNode
    794 DataNode
    1391 Jps
    

    Check that the DataNode (DN) on node2 started:

    root@node2:/usr/local# jps
    16258 DataNode
    17897 Jps
    

    Check that the DN on node3 started:

    root@node3:~# jps
    23409 Jps
    23109 DataNode
    
  • View the web UI (the NameNode UI listens on http://node1:50070 by default in Hadoop 2.x):

(Figure: HDFS NameNode web UI)
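
For a check that does not need a browser, dfsadmin summarizes the cluster (stock HDFS CLI); the report should list three live DataNodes:

    hdfs dfsadmin -report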

Starting YARN

root@node1:/usr/local/hadoop-2.6.0-cdh5.15.1/sbin# ./start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/yarn-root-resourcemanager-node1.out
node3: starting nodemanager, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/yarn-root-nodemanager-node3.out
node2: starting nodemanager, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/yarn-root-nodemanager-node2.out
node1: starting nodemanager, logging to /usr/local/hadoop-2.6.0-cdh5.15.1/logs/yarn-root-nodemanager-node1.out
  • Check that the ResourceManager (RM) and NodeManager (NM) on node1 started:

    root@node1:/usr/local/hadoop-2.6.0-cdh5.15.1/sbin# jps
    29921 NameNode
    30370 SecondaryNameNode
    5957 NodeManager
    6571 Jps
    5613 ResourceManager
    30126 DataNode
    

    Check the NM on node2:

    root@node2:~# jps
    5283 Jps
    25913 DataNode
    3660 NodeManager
    

    Check the NM on node3:

    root@node3:~# jps
    31572 DataNode
    6567 NodeManager
    8382 Jps
    
  • View the YARN web UI (the ResourceManager UI listens on http://node1:8088 by default):

(Figure: YARN ResourceManager web UI)
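
The YARN CLI offers an equivalent check; with all NodeManagers up, three nodes should be listed as RUNNING:

    yarn node -list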

5. Submitting a Job to the Hadoop Cluster

  1. Run one of the bundled example jars from /usr/local/hadoop-2.6.0-cdh5.15.1/share/hadoop/mapreduce:

    root@node2:/usr/local/hadoop-2.6.0-cdh5.15.1/share/hadoop/mapreduce# hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.15.1.jar pi 2 3
    
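The pi job prints an estimate of π when it completes. As a further end-to-end test that also exercises HDFS, the same jar ships a wordcount example (the /input and /output paths here are illustrative):

    # Stage some input in HDFS (any text file will do)
    hdfs dfs -mkdir -p /input
    hdfs dfs -put /etc/hosts /input/

    # Run wordcount and inspect the result
    hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.15.1.jar wordcount /input /output
    hdfs dfs -cat /output/part-r-00000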