Setting up a Hadoop distributed cluster and a big-data development environment (configuring HDFS, YARN, MapReduce, etc.)

I. Hadoop cluster

1. Nodes

Masters:

master1: 192.168.75.137

master2: 192.168.75.138

Slaves:

slave1: 192.168.75.139

slave2: 192.168.75.140

Steps:

(1) Check the IP addresses

ifconfig

(2) Set the hostname

hostnamectl set-hostname <hostname>

(3) Add hostname mappings

vim /etc/hosts
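
On every node, the file should map each hostname to the IPs listed above:

192.168.75.137 master1
192.168.75.138 master2
192.168.75.139 slave1
192.168.75.140 slave2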

(4) Check whether a .ssh directory already exists

[root@master1 ~]# ls -a

If it exists, remove it first: rm -rf /root/.ssh

(5) Generate an SSH key pair

 ssh-keygen -t rsa

(6) Distribute the public key

Run on each master:

scp id_rsa.pub root@master1:/root/

scp id_rsa.pub root@slave1:/root/

...and so on for the remaining nodes.

(7) Authorize the key

Needed on both the masters and the slaves:

cat id_rsa.pub >> .ssh/authorized_keys

(8) Test
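
Alternatively, ssh-copy-id combines steps (6) and (7) by appending the key to the remote authorized_keys in one step:

[root@master1 ~]# ssh-copy-id root@slave1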

[root@master1 ~]# ssh slave1
Last login: Tue Jul 17 09:52:38 2018 from 192.168.75.1
[root@slave1 ~]# 

 

2. Configure Java environment variables

(1)vim /etc/profile

Append at the end:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el7_4.x86_64/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin

(2) Run source /etc/profile for the changes to take effect

(3) Verify

echo $PATH
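
You can also confirm the JDK itself; java -version should report the OpenJDK 1.8.0_151 build that JAVA_HOME points to:

java -version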

 

3. Cluster setup

(1) Configuration file path:

Change into the cluster configuration directory: cd $HADOOP_HOME/etc/hadoop/conf

(2) Add the slave nodes

[root@master1 conf]# vim /etc/hadoop/conf/slaves

Add:

master1
master2
slave1
slave2

(3) Configure core-site.xml

[root@master2 ~]# vim /etc/hadoop/conf/core-site.xml
Add:

<property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hdp/tmp</value>
</property>
<property>
        <name>fs.defaultFS</name>
        <value>hdfs://master1:8020</value>
</property>

(4) Configure hdfs-site.xml

[root@master1 conf]# vim hdfs-site.xml

Add:

<!--
       Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>4</value> <!-- must not exceed the number of DataNodes -->
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/hadoop/hdfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/hadoop/hdfs/data</value>
        </property>
</configuration>
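
These configuration files must be present on every node, not just where they were edited. Assuming the same /etc/hadoop/conf layout on each host, one way to push them from master1 (a sketch):

for h in master2 slave1 slave2; do
    scp /etc/hadoop/conf/{slaves,core-site.xml,hdfs-site.xml} root@$h:/etc/hadoop/conf/
done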
 

(5) Create the directories HDFS needs

[root@master1 ~]# mkdir /usr/hdp/tmp -p
[root@master1 ~]# mkdir /hadoop/hdfs/{data,name} -p
[root@master1 ~]# chown -R hdfs:hadoop /hadoop
[root@master1 ~]# chown -R hdfs:hadoop /usr/hdp/tmp
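
A DataNode runs on all four nodes here, so the same directories and ownership are needed on each of them. One way to mirror the master1 commands over the passwordless SSH configured earlier (a sketch):

for h in master2 slave1 slave2; do
    ssh root@$h "mkdir -p /usr/hdp/tmp /hadoop/hdfs/{data,name} && chown -R hdfs:hadoop /hadoop /usr/hdp/tmp"
done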
 

(6) Format the HDFS filesystem

On master1 (run this only once; reformatting wipes existing HDFS metadata):

[root@master1 ~]# sudo -E -u hdfs hdfs namenode -format
 

(7) Start HDFS

Start the services on master1:

[root@master1 ~]# systemctl start hadoop-hdfs-namenode
[root@master1 ~]# systemctl start hadoop-hdfs-datanode

Start the services on master2:

[root@master2 ~]# systemctl start hadoop-hdfs-datanode
[root@master2 ~]# systemctl start hadoop-hdfs-secondarynamenode

Start the services on slave1 and slave2:

[root@slave1 ~]# systemctl start hadoop-hdfs-datanode
[root@slave2 ~]# systemctl start hadoop-hdfs-datanode

(8) Check with the jps command
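
On master1, jps should list NameNode and DataNode; on master2, SecondaryNameNode and DataNode; on the slaves, just DataNode. To confirm that all four DataNodes have registered with the NameNode, a cluster report can be pulled from any node:

[root@master1 ~]# sudo -u hdfs hdfs dfsadmin -report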

4. Check the web UI

NameNode UI: http://192.168.75.137:50070

II. Big-data development environment

1. Prepare the HDFS directories that jobs will use

[root@master1 ~]# su - hdfs

-bash-4.2$ hadoop fs -mkdir /tmp

-bash-4.2$ hadoop fs -chmod -R 1777 /tmp

-bash-4.2$ hadoop fs -mkdir -p /var/log/hadoop-yarn

-bash-4.2$ hadoop fs -chown yarn:mapred /var/log/hadoop-yarn

-bash-4.2$ hadoop fs -mkdir /user

-bash-4.2$ hadoop fs -mkdir /user/hadoop

-bash-4.2$ hadoop fs -mkdir /user/history

-bash-4.2$ hadoop fs -chmod 1777 /user/history

-bash-4.2$ hadoop fs -chown mapred:hadoop /user/history
 

2. Configure yarn-site.xml

[root@master1 conf]# vim yarn-site.xml
Add:

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master2</value>
        </property>

        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>

        <property>
                <name>yarn.nodemanager.local-dirs</name>
                <value>file:///hadoop/yarn/local</value>
        </property>

        <property>
                <name>yarn.nodemanager.log-dirs</name>
                <value>/var/log/hadoop-yarn/containers</value>
        </property>

        <property>
                <name>yarn.nodemanager.remote-app-log-dir</name>
                <value>/var/log/hadoop-yarn/apps</value>
        </property>

        <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
        </property>


        <property>
                <name>yarn.scheduler.minimum-allocation-mb</name>
                <value>511</value>
        </property>

        <property>
                <name>yarn.scheduler.maximum-allocation-mb</name>
                <value>2049</value>
        </property>

        <property>
                <name>yarn.nodemanager.vmem-pmem-ratio</name>
                <value>4</value>
        </property>

        <property>
                <name>yarn.nodemanager.vmem-check-enabled</name>
                <value>false</value>
        </property>

       <property>
                <name>yarn.application.classpath</name>
                <value>$HADOOP_CONF_DIR,
                        /usr/hdp/2.6.3.0-235/hadoop/*,
                        /usr/hdp/2.6.3.0-235/hadoop/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-hdfs/*,
                        /usr/hdp/2.6.3.0-235/hadoop-hdfs/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-yarn/*,
                        /usr/hdp/2.6.3.0-235/hadoop-yarn/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-mapreduce/*,
                        /usr/hdp/2.6.3.0-235/hadoop-mapreduce/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-httpfs/*,
                        /usr/hdp/2.6.3.0-235/hadoop-httpfs/lib/*
                </value>
        </property>
</configuration>
 

 

3. Configure mapred-site.xml

[root@master1 conf]# vim mapred-site.xml

Add:

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>slave1:10020</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>slave1:19888</value>
        </property>

        <property>
                <name>yarn.app.mapreduce.am.staging-dir</name>
                <value>/user</value>
        </property>

        <property>
                <name>mapreduce.application.classpath</name>
                <value>
                        /etc/hadoop/conf/*,
                        /usr/hdp/2.6.3.0-235/hadoop/*,
                        /usr/hdp/2.6.3.0-235/hadoop-hdfs/*,
                        /usr/hdp/2.6.3.0-235/hadoop-yarn/*,
                        /usr/hdp/2.6.3.0-235/hadoop-mapreduce/*,
                        /usr/hdp/2.6.3.0-235/hadoop/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-hdfs/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-yarn/lib/*,
                        /usr/hdp/2.6.3.0-235/hadoop-mapreduce/lib/*
                </value>
        </property>

       <property>
                <name>mapreduce.map.java.opts</name>
                <value>-Xmx1024M</value>
        </property>

        <property>
                <name>mapreduce.map.memory.mb</name>
                <!-- container size; must be at least the -Xmx heap above plus JVM overhead -->
                <value>1280</value>
        </property>

        <property>
                <name>mapreduce.reduce.java.opts</name>
                <value>-Xmx1024M</value>
        </property>

        <property>
                <name>mapreduce.reduce.memory.mb</name>
                <!-- likewise, must cover the reduce task's -Xmx heap plus overhead -->
                <value>1280</value>
        </property>
 

</configuration>
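
As with the HDFS configuration, yarn-site.xml and mapred-site.xml must reach every node before the YARN services start, e.g.:

for h in master2 slave1 slave2; do
    scp /etc/hadoop/conf/{yarn-site.xml,mapred-site.xml} root@$h:/etc/hadoop/conf/
done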
 

4. Set up YARN's local directories

[root@master1 ~]# touch /etc/hadoop/conf/yarn-env.sh
[root@master1 ~]# mkdir -p /hadoop/yarn/local
[root@master1 ~]# chown yarn:yarn -R /hadoop/yarn/local
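
Every NodeManager host needs the same local directory, so mirror these commands on the other nodes, e.g.:

for h in master2 slave1 slave2; do
    ssh root@$h "touch /etc/hadoop/conf/yarn-env.sh && mkdir -p /hadoop/yarn/local && chown -R yarn:yarn /hadoop/yarn/local"
done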
 

5. Start the services

Start the ResourceManager on master2:

[root@master2 ~]# systemctl start hadoop-yarn-resourcemanager

Access the web UI at master2:8088

Start the JobHistory Server on slave1 and slave2 (mapred-site.xml above points clients at slave1):

[root@slave1 ~]# systemctl start hadoop-mapreduce-historyserver

[root@slave2 ~]# systemctl start hadoop-mapreduce-historyserver

Start a NodeManager on every node that runs a DataNode; for example, on slave2:

[root@slave2 ~]# systemctl start hadoop-yarn-nodemanager
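
Since every node here runs a DataNode, the same service must also be started on master1, master2, and slave1:

for h in master1 master2 slave1; do
    ssh root@$h "systemctl start hadoop-yarn-nodemanager"
done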
 

6. Verify

ResourceManager UI: master2:8088

JobHistory UI: slave1:19888
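
For an end-to-end check, submit one of the bundled example jobs; the path below assumes the examples jar sits in the usual HDP 2.6.3.0-235 location:

[root@master1 ~]# sudo -u hdfs hadoop jar /usr/hdp/2.6.3.0-235/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 10

If the cluster is healthy, the job appears on the ResourceManager UI while running and in the JobHistory UI once finished.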
