Deploying a Hadoop Cluster on CentOS 7

1. Planning

Three nodes: one master (NameNode + SecondaryNameNode) and two slaves (DataNodes).
Master node:
IP: 192.168.253.137
netmask: 255.255.255.0
gateway: 192.168.253.1
2 CPUs + 2 GB RAM + 20 GB disk
Slave node 1:
IP: 192.168.253.139
netmask: 255.255.255.0
gateway: 192.168.253.1
2 CPUs + 2 GB RAM + 20 GB disk
Slave node 2:
IP: 192.168.253.138
netmask: 255.255.255.0
gateway: 192.168.253.1
2 CPUs + 2 GB RAM + 20 GB disk

2. Installing the master (NameNode) node

2.1 Create a VMware Workstation VM
2.2 Install CentOS 7 Minimal on the VM
2.3 Install the prerequisites for Hadoop 2.9.1
2.3.1 Install JDK 1.7
yum install java-1.7.0-openjdk.x86_64 -y
yum install java-1.7.0-openjdk-devel.x86_64 -y
2.3.2 Configure the JDK environment
Append to the end of /etc/profile:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.181-2.6.14.8.el7_5.x86_64
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
2.3.3 Configure name resolution
Add the following entries to /etc/hosts on every node:
192.168.253.137 master
192.168.253.139 slave1
192.168.253.138 slave2
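The three entries can be appended in one step. As a safety measure, the sketch below defaults `TARGET` to a temp file so it can be tried anywhere; on each real node, run it with `TARGET=/etc/hosts` (as root).

```shell
# Cluster name resolution (IPs from the plan in section 1).
# TARGET defaults to a temp file; on each node run with TARGET=/etc/hosts.
TARGET=${TARGET:-$(mktemp)}
cat >> "$TARGET" <<'EOF'
192.168.253.137 master
192.168.253.139 slave1
192.168.253.138 slave2
EOF
grep -w master "$TARGET"
# → 192.168.253.137 master
```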
Set /etc/hostname to:
master
2.3.4 Create the account that installs and runs Hadoop
useradd hadoop
Append to the hadoop user's .bashrc:
export HADOOP_PREFIX=/usr/local/hadoop-2.9.1
export HADOOP_COMMON=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
2.4 Install Hadoop 2.9.1
Run the steps below as the hadoop user; the ones that touch root-owned paths (/usr/local, /data) need root.
2.4.1 Download the Hadoop 2.9.1 binary tarball from the Apache site or a mirror
2.4.2 Install Hadoop 2.9.1
As root (since /usr/local is root-owned), extract the tarball, then hand the tree to hadoop:
tar -xvf hadoop-2.9.1.tar.gz -C /usr/local/
chown -R hadoop:hadoop /usr/local/hadoop-2.9.1
2.4.3 Edit the Hadoop configuration files
First create /data (as root):
mkdir /data
chown -R hadoop:hadoop /data
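The `*-site.xml` files below all point into /data, so it helps to pre-create the whole local directory tree now. The sketch uses a `BASE` variable so the layout can be inspected safely anywhere; on the cluster nodes run it as root with `BASE=/data`, then `chown -R hadoop:hadoop /data/hadoop`.

```shell
# Pre-create the local directories that the *-site.xml files reference.
# BASE defaults to a temp dir; on the real nodes run with BASE=/data as root.
BASE=${BASE:-$(mktemp -d)}
mkdir -p "$BASE"/hadoop/hdfs/namenode \
         "$BASE"/hadoop/hdfs/datanode \
         "$BASE"/hadoop/hdfs/namesecondary \
         "$BASE"/hadoop/yarn/local \
         "$BASE"/hadoop/local/tmp \
         "$BASE"/hadoop/logs
find "$BASE/hadoop" -type d | sort
```

Creating the DataNode directories on the master is harmless here, since the master VM is later cloned to make the slaves.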
2.4.3.0 Edit hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.181-2.6.14.8.el7_5.x86_64
export HADOOP_LOG_DIR=/data/hadoop/logs/$USER
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
2.4.3.1 Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/local/tmp/hadoop-${user.name}</value>
</property>
</configuration>
2.4.3.2 Edit hdfs-site.xml
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/hadoop/hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///data/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file:///data/hadoop/hdfs/namesecondary</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:50090</value>
</property>
</configuration>
2.4.3.3 Edit mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
2.4.3.4 Edit yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/data/hadoop/yarn/local</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/data/hadoop/yarn/tmp/logs</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://master:19888/jobhistory/logs</value>
<description>URL for job history server</description>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
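One more file matters before cloning: in Hadoop 2.x, start-dfs.sh and start-yarn.sh read the worker list from $HADOOP_CONF_DIR/slaves, which ships containing just `localhost`. Replace its contents with the two DataNode hostnames, one per line:

```
slave1
slave2
```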

3. Deploying the Hadoop slave (DataNode) nodes
3.1 Clone the master node
Clone the master VM to slave1 and slave2.
3.2 Adjust the settings on slave1 and slave2
3.2.1 Edit /etc/hostname
Set it to slave1 and slave2 respectively.
3.2.2 Edit the network settings
Set the IPs to 192.168.253.139 (slave1) and 192.168.253.138 (slave2), matching the plan in section 1.
3.3 Set up passwordless login from master to slave1 and slave2
3.3.1 Generate a key pair on the master node
Log in to master as the hadoop user and run:
ssh-keygen -t rsa
3.3.2 Copy the master's RSA public key to the slaves
ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
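It is worth confirming the key copy worked before starting any daemons. A small helper (hypothetical, a sketch) using `-o BatchMode=yes` makes ssh fail instead of prompting for a password, so missing keys are reported rather than hung on:

```shell
# Report, for each host given, whether passwordless ssh login succeeds.
# BatchMode forbids password prompts, so a missing key fails fast.
verify_ssh() {
  local h rc=0
  for h in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
      echo "$h: ok"
    else
      echo "$h: passwordless login FAILED"
      rc=1
    fi
  done
  return $rc
}
```

On the master, as the hadoop user, `verify_ssh slave1 slave2` should print two `ok` lines before moving on to section 4.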
4. Running the Hadoop Cluster
On the master node, as the hadoop user:
4.1 Format HDFS
hdfs namenode -format
4.2 Start HDFS
start-dfs.sh
4.3 Start YARN
start-yarn.sh
4.4 Start the JobHistory server
mr-jobhistory-daemon.sh start historyserver
4.5 Verify
On the master node, run:
jps
Output:
[hadoop@master hadoop]$ jps
13188 SecondaryNameNode
13720 Jps
13668 JobHistoryServer
13363 ResourceManager
12979 NameNode
On the slave nodes, run:
jps
Output:
[hadoop@slave1 hadoop]$ jps
11907 NodeManager
12265 Jps
11711 DataNode
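Eyeballing jps output works for three nodes, but the check scripts easily. Here is a small helper (hypothetical, not part of Hadoop) that greps a jps listing for the daemons a role should show, demonstrated against the master listing above:

```shell
# Print OK if every named daemon appears in the jps listing,
# otherwise name the first one that is missing.
check_daemons() {
  local listing="$1"; shift
  local d
  for d in "$@"; do
    echo "$listing" | grep -qw "$d" || { echo "MISSING: $d"; return 1; }
  done
  echo "OK"
}

# Demo against the master's jps output shown above:
master_jps="13188 SecondaryNameNode
13668 JobHistoryServer
13363 ResourceManager
12979 NameNode"
check_daemons "$master_jps" NameNode SecondaryNameNode ResourceManager JobHistoryServer
# → OK
```

On a slave, `check_daemons "$(jps)" DataNode NodeManager` gives the same one-line answer.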