YARN Pseudo-Distributed Mode
This follows on from the previous chapter,
Hadoop local (standalone) run mode.
Steps:
Pseudo-distributed mode is simply local mode with four configuration files edited:
core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml
1. Passwordless SSH login
[root@flink102 hadoop]# ssh-keygen -t rsa
Go into /root/.ssh and append the contents of id_rsa.pub to authorized_keys:
[root@flink102 hadoop]# cd /root/.ssh/
[root@flink102 .ssh]# cat id_rsa.pub >> authorized_keys
[root@flink102 .ssh]# ls
authorized_keys id_rsa id_rsa.pub known_hosts
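If login still prompts for a password after this, the usual culprit is file permissions: sshd silently ignores authorized_keys when it or ~/.ssh is group/world writable. A small sketch (the helper name is my own):

```shell
# sshd ignores authorized_keys when ~/.ssh or the file itself is
# group/world writable, so tighten permissions before testing the login
fix_ssh_perms() {
  chmod 700 "$1" && chmod 600 "$1/authorized_keys"
}
# fix_ssh_perms /root/.ssh
# ssh -o BatchMode=yes localhost 'echo passwordless ok'   # must not prompt
```

BatchMode makes ssh fail instead of falling back to a password prompt, which cleanly confirms key-only login works.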
2. What the main configuration files do
core-site.xml: the address of the Hadoop master node (NameNode)
hdfs-site.xml: the number of file replicas in HDFS
mapred-site.xml: the resource-management framework MapReduce runs on
First, set the parameters in yarn-env.sh, which lives under etc/hadoop in the Hadoop installation directory:
[root@localhost hadoop]# ls
capacity-scheduler.xml hadoop-env.sh httpfs-env.sh kms-env.sh mapred-env.sh ssl-server.xml.example
configuration.xsl hadoop-metrics2.properties httpfs-log4j.properties kms-log4j.properties mapred-queues.xml.template yarn-env.cmd
container-executor.cfg hadoop-metrics.properties httpfs-signature.secret kms-site.xml mapred-site.xml.template yarn-env.sh
core-site.xml hadoop-policy.xml httpfs-site.xml log4j.properties slaves yarn-site.xml
hadoop-env.cmd hdfs-site.xml kms-acls.xml mapred-env.cmd ssl-client.xml.example
[root@localhost hadoop]#
[root@localhost hadoop]# vim yarn-env.sh
# some Java parameters
export JAVA_HOME=/usr/local/java/jdk/jdk1.8.0_221
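A wrong JAVA_HOME here only surfaces later as a daemon startup failure, so it is worth confirming the path actually contains a JDK. A quick sketch (the helper name is my own):

```shell
# Succeeds only if the given directory contains an executable bin/java
check_java_home() {
  [ -x "$1/bin/java" ] && echo "JAVA_HOME looks valid: $1"
}
# check_java_home /usr/local/java/jdk/jdk1.8.0_221
```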
Configure core-site.xml
<!-- Address of the HDFS NameNode -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://flink102:9000</value>
</property>
<!-- Directory where Hadoop stores its runtime files -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop/module/hadoop-2.7.2/data</value>
</property>
Configure yarn-site.xml
[root@localhost hadoop]# vim yarn-site.xml
<!-- Site specific YARN configuration properties -->
<!-- How reducers fetch map output -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Address of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.219.7</value>
</property>
Configure hdfs-site.xml
<!-- Number of HDFS replicas -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!-- SecondaryNameNode web UI address (50090 is its default port) -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>flink102:50090</value>
</property>
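A malformed edit to any of these *-site.xml files makes the daemons die at startup with a parse error, so it is worth validating them before moving on. A sketch, assuming python3 is installed (the helper name is my own):

```shell
# Parse a *-site.xml file; prints OK only when it is well-formed XML
check_site_xml() {
  python3 -c 'import sys, xml.dom.minidom as m; m.parse(sys.argv[1])' "$1" \
    && echo "$1: OK"
}
# for f in core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml; do
#   check_site_xml "$f"
# done
```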
3. Configure mapred-env.sh (usually only JAVA_HOME needs setting here, same as in yarn-env.sh)
[root@localhost hadoop]# vim mapred-env.sh
4. Configure mapred-site.xml (first copy mapred-site.xml.template to mapred-site.xml):
[root@localhost hadoop]# cp mapred-site.xml.template mapred-site.xml
[root@localhost hadoop]#
[root@localhost hadoop]# vim mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>flink102:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>flink102:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/history/done</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/history/done/done_intermediate</value>
</property>
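The jobhistory properties above only take effect once the JobHistory server itself is running. The daemon script below ships with Hadoop 2.x; `config_port` is a small helper of my own for reading a port back out of a *-site.xml file:

```shell
# Start the MapReduce JobHistory server (serves the web UI on 19888 above)
sbin/mr-jobhistory-daemon.sh start historyserver

# Read the port from the <value> line that follows a property <name>
config_port() {   # usage: config_port <property-name> <file>
  grep -A1 "$1" "$2" | grep -o '[0-9]\+' | tail -n 1
}
# config_port mapreduce.jobhistory.webapp.address etc/hadoop/mapred-site.xml
```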
Once configuration is complete, start the cluster.
Note: the NameNode and DataNode must already be running before starting YARN (on the very first run, format the NameNode with bin/hdfs namenode -format, then start both with sbin/hadoop-daemon.sh start namenode and sbin/hadoop-daemon.sh start datanode).
Start the ResourceManager:
[root@localhost hadoop-2.7.2]# sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-MissZhou-resourcemanager-localhost.localdomain.out
[root@localhost hadoop-2.7.2]#
Start the NodeManager:
[root@localhost hadoop-2.7.2]# sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-MissZhou-nodemanager-localhost.localdomain.out
[root@localhost hadoop-2.7.2]#
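At this point jps should list ResourceManager and NodeManager alongside the HDFS daemons. A small helper (my own) makes the check scriptable:

```shell
# Returns 0 only if every expected YARN daemon appears in a jps listing
check_daemons() {
  local listing="$1" missing=0
  for d in ResourceManager NodeManager; do
    echo "$listing" | grep -q "$d" || { echo "missing: $d"; missing=1; }
  done
  return $missing
}
# check_daemons "$(jps)" && echo "YARN daemons are up"
```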
Access the YARN web UI at:
http://<VM IP address>:8088
If the page cannot be reached, see:
Fixing inaccessible web UIs (ports 50070 and 8088)
1. Upload a directory to HDFS:
[root@localhost hadoop-2.7.2]# hadoop fs -put wcinput /
[root@localhost hadoop-2.7.2]#
2. Check the upload arrived via the NameNode web UI on port 50070 (or with hadoop fs -ls /wcinput):
An error occurred when launching the job (note that the command below also passes the same path as both input and output, which would fail on its own):
[root@localhost hadoop-2.7.2]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /usr/misszhou/output/ /usr/misszhou/output/
19/11/15 23:20:32 INFO client.RMProxy: Connecting to ResourceManager at hadoop100/192.168.219.100:8032
19/11/15 23:20:37 INFO ipc.Client: Retrying connect to server: hadoop100/192.168.219.100:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/11/15 23:20:40 INFO ipc.Client: Retrying connect to server: hadoop100/192.168.219.100:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/11/15 23:20:43 INFO ipc.Client: Retrying connect to server: hadoop100/192.168.219.100:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
19/11/15 23:20:46 INFO ipc.Client: Retrying connect to server: hadoop100/192.168.219.100:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Cause analysis: the Hadoop daemons had not been started, so nothing was listening at the ResourceManager address (hadoop100/192.168.219.100:8032).
Fix: run start-all.sh from the Hadoop installation:
[root@localhost bin]# start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [192.168.219.7]
root@192.168.219.7's password:
192.168.219.7: starting namenode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-namenode-localhost.localdomain.out
root@localhost's password:
localhost: starting datanode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-datanode-localhost.localdomain.out
Starting secondary namenodes [0.0.0.0]
root@0.0.0.0's password:
0.0.0.0: secondarynamenode running as process 10292. Stop it first.
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-resourcemanager-localhost.localdomain.out
root@localhost's password:
localhost: starting nodemanager, logging to /opt/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-localhost.localdomain.out
[root@localhost bin]#
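With all daemons up, the wordcount job can be rerun. The failed command earlier passed the same directory as both input and output, which MapReduce rejects; a small guard (the wrapper name is my own) makes that mistake impossible to repeat:

```shell
# Refuses to submit when input and output paths are identical; the output
# directory must also not exist yet, or the job fails immediately
run_wordcount() {
  local in="$1" out="$2"
  if [ "$in" = "$out" ]; then
    echo "input and output paths must differ" >&2
    return 1
  fi
  hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar \
    wordcount "$in" "$out"
}
# run_wordcount /wcinput /wcoutput
# hadoop fs -cat /wcoutput/part-r-00000   # view the word counts
```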