The previous section finished "Fully Distributed Configuration – the Distribution Script".
What we need to do next is:
Fully Distributed Configuration – Quickly Distributing Hadoop.
First, copy the xsync file into the /bin directory,
so that every user can run xsync from any directory (here we test it from the hadoop directory):
[root@hadoop101 MissZhou]# sudo cp xsync /bin/
[root@hadoop101 MissZhou]#
[root@hadoop101 hadoop-2.7.2]# xsync
no args...
[root@hadoop101 hadoop-2.7.2]#
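For reference, the xsync script built in the previous section is essentially a loop that pushes a path to the other hosts with rsync. A minimal sketch of such a script is shown below; the host list and paths are assumptions based on this walkthrough, not necessarily the exact script from the previous section.
#!/bin/bash
# xsync <file-or-dir> : push a path to the other cluster hosts (sketch only)
if [ $# -lt 1 ]; then
    echo "no args..."
    exit 1
fi
# resolve the absolute directory and base name of the argument
pdir=$(cd -P $(dirname "$1"); pwd)
fname=$(basename "$1")
user=$(whoami)
# hosts to push to -- assumed here, adjust to the real cluster
for host in hadoop100 hadoop102; do
    rsync -av "$pdir/$fname" "$user@$host:$pdir"
done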
Next, send it to hadoop100, using rsync -av:
[root@hadoop101 hadoop-2.7.2]# rsync -av /bin/xsync hadoop100:/bin/
root@hadoop100's password:
sending incremental file list
xsync
sent 590 bytes received 35 bytes 178.57 bytes/sec
total size is 499 speedup is 0.80
[root@hadoop101 hadoop-2.7.2]#
Once it has been sent successfully, hadoop100 can run xsync as well:
[root@hadoop100 hadoop]# xsync
no args...
[root@hadoop100 hadoop]#
Next, since we are now building a fully distributed cluster, a few things from the earlier pseudo-distributed setup have to change.
First, go to the Hadoop configuration directory (the one holding core-site.xml) and delete the old tmp data directory left over from the pseudo-distributed run:
[root@hadoop100 hadoop]# rm -rf /opt/hadoop/module/hadoop-2.7.2/tmp/
[root@hadoop100 hadoop]#
You can list the Hadoop installation directory to check whether the tmp directory is still there; if it is gone, the deletion succeeded. For example:
[root@hadoop100 hadoop-2.7.2]# ls
bin data etc include input lib libexec LICENSE.txt logs NOTICE.txt output README.txt sbin share wcinput wc.input wcountput workcount
[root@hadoop100 hadoop-2.7.2]#
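core-site.xml itself is not shown above; for this layout it would typically point fs.defaultFS at the NameNode host (hadoop100, i.e. 192.168.219.7, which is where port 9000 is listening later in this section) and set hadoop.tmp.dir. The tmp path below is an assumption, not taken from the original configuration:
<!-- Address of the NameNode (sketch) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop100:9000</value>
</property>
<!-- Base directory for Hadoop runtime data (path assumed) -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop/module/hadoop-2.7.2/data/tmp</value>
</property>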
Next, configure hdfs-site.xml:
<!-- Number of HDFS replicas -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<!-- Host of the Hadoop secondary NameNode -->
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>hadoop102:50090</value>
</property>
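After editing, the replication value that HDFS actually picks up can be checked with hdfs getconf (assuming it is run from the Hadoop installation directory); it should print 3 once the new hdfs-site.xml is in effect:
bin/hdfs getconf -confKey dfs.replication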
Then, configure yarn-site.xml:
<!-- Site specific YARN configuration properties -->
<!-- How reducers obtain data -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<!-- Host of the YARN ResourceManager -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop101</value>
</property>
Modify mapred-site.xml:
<!-- Run MapReduce on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
Finally, modify the slaves file:
192.168.219.9 hadoop103
192.168.219.8 hadoop102
192.168.219.7 hadoop101
Then, add the host mappings to /etc/hosts:
192.168.219.7 hadoop100
192.168.219.9 hadoop102
192.168.219.8 hadoop101
Next, set up passwordless SSH login (without it, the cluster cannot be started).
See: passwordless login setup for a Hadoop cluster.
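The gist of that setup, as a sketch to be run on each node that needs to log in to the others (hostnames assumed from this cluster):
# generate an RSA key pair (accept the defaults)
ssh-keygen -t rsa
# copy the public key to every node, including the local one
ssh-copy-id hadoop100
ssh-copy-id hadoop101
ssh-copy-id hadoop102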
Copy all the configuration files to the other machine (hadoop101):
scp -r /opt/hadoop/module/hadoop-2.7.2/etc/hadoop/* root@hadoop101:/opt/hadoop/module/hadoop-2.7.2/etc/hadoop
[root@hadoop100 hadoop]# scp -r /opt/hadoop/module/hadoop-2.7.2/etc/hadoop/* root@hadoop101:/opt/hadoop/module/hadoop-2.7.2/etc/hadoop
capacity-scheduler.xml 100% 4436 802.9KB/s 00:00
configuration.xsl 100% 1335 334.0KB/s 00:00
container-executor.cfg 100% 318 82.5KB/s 00:00
core-site.xml 100% 1068 391.3KB/s 00:00
hadoop-env.cmd 100% 3670 1.3MB/s 00:00
hadoop-env.sh 100% 4245 1.0MB/s 00:00
hadoop-metrics2.properties 100% 2598 477.3KB/s 00:00
hadoop-metrics.properties 100% 2490 646.1KB/s 00:00
hadoop-policy.xml 100% 9683 1.1MB/s 00:00
hdfs-site.xml 100% 1171 585.9KB/s 00:00
httpfs-env.sh 100% 1449 600.1KB/s 00:00
httpfs-log4j.properties 100% 1657 512.7KB/s 00:00
httpfs-signature.secret 100% 21 4.6KB/s 00:00
httpfs-site.xml 100% 620 180.4KB/s 00:00
kms-acls.xml 100% 3518 524.0KB/s 00:00
kms-env.sh 100% 1527 955.2KB/s 00:00
kms-log4j.properties 100% 1631 404.6KB/s 00:00
kms-site.xml 100% 5511 1.1MB/s 00:00
log4j.properties 100% 11KB 2.8MB/s 00:00
mapred-env.cmd 100% 951 194.2KB/s 00:00
mapred-env.sh 100% 1389 206.5KB/s 00:00
mapred-queues.xml.template 100% 4113 1.4MB/s 00:00
mapred-site.xml 100% 1111 544.7KB/s 00:00
mapred-site.xml.template 100% 758 215.6KB/s 00:00
slaves 100% 32 15.3KB/s 00:00
ssl-client.xml.example 100% 2316 464.8KB/s 00:00
ssl-server.xml.example 100% 2268 892.5KB/s 00:00
yarn-env.cmd 100% 2250 795.4KB/s 00:00
yarn-env.sh 100% 4573 734.4KB/s 00:00
yarn-site.xml 100% 833 217.6KB/s 00:00
[root@hadoop100 hadoop]#
Next, re-format the NameNode.
Note:
Within a single Hadoop cluster this only needs to be done once; doing it again can cause unpredictable problems and will very likely lead to loss of cluster data.
[root@hadoop100 hadoop]# hadoop namenode -format
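If a re-format ever does become necessary, the usual precaution (a sketch, not part of the original run) is to stop the cluster first and remove the old data and logs directories on every node, so that stale DataNode data does not clash with the new NameNode cluster ID:
# on every node, after stopping the cluster (paths taken from this installation)
rm -rf /opt/hadoop/module/hadoop-2.7.2/data/ /opt/hadoop/module/hadoop-2.7.2/logs/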
Finally, start the cluster:
[root@hadoop100 hadoop-2.7.2]# sbin/start-dfs.sh
Starting namenodes on [hadoop100]
hadoop100: starting namenode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-namenode-hadoop100.out
hadoop103: ssh: Could not resolve hostname hadoop103: Name or service not known
hadoop101: starting datanode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-datanode-hadoop101.out
hadoop102: starting datanode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-datanode-hadoop102.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-secondarynamenode-hadoop100.out
[root@hadoop100 hadoop-2.7.2]# netstat -tpnl | grep java
tcp 0 0 192.168.219.7:9000 0.0.0.0:* LISTEN 10398/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 8553/java
tcp 0 0 127.0.0.1:46549 0.0.0.0:* LISTEN 10532/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 10398/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 10532/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 10532/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 10532/java
[root@hadoop100 hadoop-2.7.2]#
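Besides netstat, jps gives a quick view of which Hadoop daemons are running; on hadoop100 at this point it would be expected to list NameNode, DataNode and SecondaryNameNode (plus Jps itself):
jps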
[root@hadoop100 hadoop-2.7.2]# systemctl stop firewalld.service
[root@hadoop100 hadoop-2.7.2]#
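The firewall is stopped above so that the web UI ports (50070, and later 8088) are reachable from the host machine. Stopping it only lasts until the next reboot; to make the change permanent (an extra step, not part of the original run) the service can also be disabled:
systemctl disable firewalld.service
systemctl status firewalld.service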
Visit: http://<VM IP>:50070 (here, http://192.168.219.7:50070).
Nothing is listening on port 8088, so the YARN web UI at 8088 cannot be reached.
The cause: the ResourceManager process has not been started.
Fix:
Modify the configuration file yarn-site.xml:
<!-- Site specific YARN configuration properties -->
<!-- How reducers obtain data -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<!-- Host of the YARN ResourceManager -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop101</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.219.7:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>192.168.219.7:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>192.168.219.7:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>192.168.219.7:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>192.168.219.7:8088</value>
</property>
Restart YARN with start-yarn.sh:
[root@hadoop100 hadoop-2.7.2]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-resourcemanager-hadoop100.out
hadoop103: ssh: Could not resolve hostname hadoop103: Name or service not known
192.168.219.8: starting nodemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop101.out
hadoop101: nodemanager running as process 11044. Stop it first.
hadoop102: mv: cannot stat ‘/opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out.4’: No such file or directory
hadoop102: mv: cannot stat ‘/opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out.3’: No such file or directory
hadoop102: mv: cannot stat ‘/opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out.2’: No such file or directory
hadoop102: mv: cannot stat ‘/opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out.1’: No such file or directory
192.168.219.9: starting nodemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out
hadoop102: starting nodemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out
hadoop102: mv: cannot stat ‘/opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop102.out’: No such file or directory
192.168.219.7: starting nodemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop100.out
hadoop102: ulimit -a
hadoop102: core file size (blocks, -c) 0
hadoop102: data seg size (kbytes, -d) unlimited
hadoop102: scheduling priority (-e) 0
hadoop102: file size (blocks, -f) unlimited
hadoop102: pending signals (-i) 3795
hadoop102: max locked memory (kbytes, -l) 64
hadoop102: max memory size (kbytes, -m) unlimited
hadoop102: open files (-n) 1024
hadoop102: pipe size (512 bytes, -p) 8
Check the listening ports with netstat -tpnl | grep java:
tcp 0 0 192.168.219.7:9000 0.0.0.0:* LISTEN 7629/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 7629/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 7765/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 7765/java
tcp 0 0 127.0.0.1:35293 0.0.0.0:* LISTEN 7765/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 7765/java
tcp6 0 0 :::8040 :::* LISTEN 8488/java
tcp6 0 0 :::8042 :::* LISTEN 8488/java
tcp6 0 0 192.168.219.7:8088 :::* LISTEN 8373/java
tcp6 0 0 :::13562 :::* LISTEN 8488/java
tcp6 0 0 192.168.219.7:8030 :::* LISTEN 8373/java
tcp6 0 0 192.168.219.7:8031 :::* LISTEN 8373/java
tcp6 0 0 192.168.219.7:8032 :::* LISTEN 8373/java
tcp6 0 0 192.168.219.7:8033 :::* LISTEN 8373/java
tcp6 0 0 :::46564 :::* LISTEN 8488/java
Visit port 8088 (http://192.168.219.7:8088); the YARN ResourceManager web UI is now reachable.
Distribute the configuration across the cluster:
[root@hadoop100 hadoop]# scp -r /opt/hadoop/module/hadoop-2.7.2/etc/hadoop/* root@hadoop101:/opt/hadoop/module/hadoop-2.7.2/etc/hadoop/
[root@hadoop100 hadoop]# scp -r /opt/hadoop/module/hadoop-2.7.2/etc/hadoop/* root@hadoop102:/opt/hadoop/module/hadoop-2.7.2/etc/hadoop/
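Since xsync is already installed in /bin on these nodes, the same distribution could in principle be done in one command, assuming the script loops over the right hosts:
xsync /opt/hadoop/module/hadoop-2.7.2/etc/hadoop/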
Start the cluster again:
[root@hadoop100 hadoop-2.7.2]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hadoop100]
..........................
Check the listening ports once more:
[root@hadoop100 hadoop-2.7.2]# netstat -tpnl | grep java
tcp 0 0 192.168.219.7:9000 0.0.0.0:* LISTEN 7629/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 7629/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 7765/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 7765/java
tcp 0 0 127.0.0.1:35293 0.0.0.0:* LISTEN 7765/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 7765/java
tcp6 0 0 :::8040 :::* LISTEN 8488/java
tcp6 0 0 :::8042 :::* LISTEN 8488/java
tcp6 0 0 192.168.219.7:8088 :::* LISTEN 8373/java
tcp6 0 0 :::13562 :::* LISTEN 8488/java
tcp6 0 0 192.168.219.7:8030 :::* LISTEN 8373/java
tcp6 0 0 192.168.219.7:8031 :::* LISTEN 8373/java
tcp6 0 0 192.168.219.7:8032 :::* LISTEN 8373/java
tcp6 0 0 192.168.219.7:8033 :::* LISTEN 8373/java
tcp6 0 0 :::46564 :::* LISTEN 8488/java
Visit the web UIs (port 50070 for HDFS, port 8088 for YARN) to confirm the fully distributed cluster is up.