Overall workflow
Simulating a cluster with three slave nodes:
1. Change each VM's network mode, hostname, IP, and hosts mapping file, and disable the firewall
2. Install the SSH client so the VMs can transfer data to each other
3. Install JDK 8
4. Unpack the Hadoop distribution and configure the environment variables
Preparing the machines
First set the VM's network mode to NAT.
Change the hostname
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hdp01 ## set the hostname
Change the VM's IP address
vi /etc/sysconfig/network-scripts/ifcfg-eth0
## Set ONBOOT to yes so the interface comes up automatically, remove the MAC address (HWADDR) and UUID lines, set BOOTPROTO to static, then fill in the IP settings yourself
DEVICE="eth0"
BOOTPROTO="static"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
IPADDR=192.168.1.101
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=114.114.114.114
DNS2=114.114.115.115
Configure the hostname mapping
vi /etc/hosts
# add the IP-to-hostname mappings (these must match the static IPs configured above)
192.168.1.101 hdp01
192.168.1.102 hdp02
192.168.1.103 hdp03
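All three nodes need identical mappings in /etc/hosts. As a sketch, a small shell helper can flag any name missing from the file (the function `check_mappings` is hypothetical, not part of any Hadoop tooling):

```shell
# check_mappings FILE NAME... - print every NAME that has no entry in FILE.
# Hypothetical helper; run as: check_mappings /etc/hosts hdp01 hdp02 hdp03
check_mappings() {
  file=$1; shift
  for h in "$@"; do
    # -w matches whole words only, so hdp01 will not match inside hdp011
    grep -qw "$h" "$file" || echo "missing: $h"
  done
}
```

Run it on every node after editing /etc/hosts; silence means all names are mapped.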
Disable the firewall and its autostart
# check the firewall status
service iptables status
# stop the firewall
service iptables stop
# check whether the firewall starts at boot
chkconfig iptables --list
# disable firewall autostart
chkconfig iptables off
Install the SSH client
yum install -y openssh-clients
Reboot so the configuration takes effect
reboot
Install rz/sz for file transfer over the terminal
yum install -y lrzsz
Install Hadoop
Download a release from http://archive.apache.org/dist/hadoop/core/ and unpack it.
Go into the etc/hadoop directory of the unpacked distribution and edit the configuration files.
## Edit hadoop-env.sh and set the JDK path (around line 27):
export JAVA_HOME=/usr/local/java/jdk1.8 ## set this machine's JAVA_HOME
## Edit core-site.xml as follows
<!-- address of the HDFS master (NameNode) -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hdp01:9000</value>
</property>
<!-- directory where Hadoop stores its runtime files -->
<property>
<name>hadoop.tmp.dir</name>
<value>/cloud/hadoop/tmp</value>
</property>
## Edit hdfs-site.xml as follows
<!-- number of HDFS block replicas -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- host and port of the SecondaryNameNode -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hdp02:50090</value>
</property>
## Edit mapred-site.xml. The file does not exist by default, so rename the template first: mv mapred-site.xml.template mapred-site.xml
<!-- tell the MapReduce framework to run on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
## Edit yarn-site.xml as follows
<!-- reducers fetch intermediate data via mapreduce_shuffle -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hdp01</value>
</property>
## Edit the slaves file: delete its existing contents and list the worker hostnames
hdp01
hdp02
hdp03
Update the environment variables: add JAVA_HOME and HADOOP_HOME to /etc/profile, then reload it
source /etc/profile
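Step 4 of the overview calls for environment variables, but the guide never shows them. A minimal sketch of the lines to append to /etc/profile before sourcing it, assuming the JDK path used in hadoop-env.sh above and Hadoop unpacked under /cloud/hadoop (both paths are assumptions, adjust to your layout):

```shell
# Assumed paths: point these at wherever the JDK and Hadoop actually live.
export JAVA_HOME=/usr/local/java/jdk1.8
export HADOOP_HOME=/cloud/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

With bin and sbin on PATH, the hdfs, start-dfs.sh, and start-yarn.sh commands used later work from any directory.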
Clone the virtual machines
Each clone keeps the source VM's hostname, IP, and MAC address, and cloned VMs usually come up with a broken network interface, so fix the following on every clone.
Change the hostname
vi /etc/sysconfig/network
Change the IP address; remember to remove the UUID and MAC address (HWADDR) lines so they do not conflict with the source machine
vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="static"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
IPADDR=192.168.1.102
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=114.114.114.114
DNS2=114.114.115.115
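A quick way to confirm the HWADDR and UUID lines really were removed from a clone's ifcfg file (the helper name is hypothetical, written only to illustrate the check):

```shell
# clean_ifcfg_check FILE - report leftover HWADDR/UUID lines that would
# conflict with the source VM after cloning. Hypothetical helper.
clean_ifcfg_check() {
  if grep -Eq '^(HWADDR|UUID)=' "$1"; then
    echo "conflict: remove HWADDR/UUID from $1"
  else
    echo "ok"
  fi
}
# Example: clean_ifcfg_check /etc/sysconfig/network-scripts/ifcfg-eth0
```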
Delete the udev NIC rules file so the system regenerates the interface configuration on its own
rm -rf /etc/udev/rules.d/70-persistent-net.rules
Reboot, then restart the network service
reboot
service network restart
Set up passwordless login from the master
ssh-keygen -t rsa generates a key pair; two files (id_rsa and id_rsa.pub) appear in the user's ~/.ssh directory
ssh-copy-id hdp01 copies the public key to a machine to enable passwordless login; once the hostname mapping is configured, the mapped name is all you need
[root@hdpvm1 ~]# cd ~/.ssh
[root@hdpvm1 .ssh]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
33:8c:61:a5:eb:25:d5:7a:33:51:f9:4e:9b:18:36:e3 root@hdpvm1
The key's randomart image is:
+--[ RSA 2048]----+
| . .. |
| o . .. |
| + . o . |
| . * . .= o |
| + S +o B o |
| . o + oE + |
| . |
| |
| |
+-----------------+
[root@hdpvm1 .ssh]# ssh-copy-id hdp01
The authenticity of host 'hdpvm1 (192.168.254.105)' can't be established.
RSA key fingerprint is f8:98:34:82:c5:8b:14:fd:1a:26:42:28:1b:44:d3:3e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hdpvm1,192.168.254.105' (RSA) to the list of
known hosts.
root@hdpvm1's password:
Now try logging into the machine, with "ssh 'hdpvm1'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@hdpvm1 .ssh]# ls
authorized_keys id_rsa id_rsa.pub known_hosts
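The transcript above covers hdp01 only; the same ssh-copy-id must be repeated for every node, including the master itself. A hedged sketch of the loop (node names taken from the mapping file configured earlier):

```shell
# Push the master's public key to every node, itself included, so that
# start-dfs.sh / start-yarn.sh can ssh to all workers without a password.
for host in hdp01 hdp02 hdp03; do
  ssh-copy-id "$host"   # prompts for that host's root password once
done
```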
Start the cluster
# format HDFS first (run this once only, on the NameNode)
hdfs namenode -format
The old way was start-all.sh/stop-all.sh, which starts HDFS and MapReduce together; it is now deprecated in favor of these two commands:
start-dfs.sh/stop-dfs.sh
start-yarn.sh/stop-yarn.sh
Use the jps command to list the running Java processes
4688 NodeManager
4454 SecondaryNameNode
5005 Jps
4192 NameNode
4308 DataNode
4594 ResourceManager
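Each expected daemon should appear in the jps listing. A hedged helper (the name check_daemons is mine, not a Hadoop command) to verify it, assuming the master runs all five daemons as in the sample output above:

```shell
# check_daemons "JPS_OUTPUT" - print any expected Hadoop daemon that is
# missing from a jps listing. Hypothetical helper; usage: check_daemons "$(jps)"
check_daemons() {
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -w keeps "NameNode" from matching inside "SecondaryNameNode"
    echo "$1" | grep -qw "$d" || echo "not running: $d"
  done
}
```

No output means all five daemons are up; otherwise it names the missing ones.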
Once the cluster is up, it can be checked through the web interfaces: port 50070 serves the HDFS (NameNode) management page and port 8088 the YARN/MR management page, e.g. http://hdp01:50070 and http://hdp01:8088.