YARN Cluster Mode Configuration (Part 1)

 

tar -zxvf hadoop-2.7.7.tar.gz

cd /home/spark/bigdata/hadoop-2.7.7/etc/hadoop

vi hadoop-env.sh

export JAVA_HOME=/opt/jdk1.8.0_201
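
Before going further, it can help to confirm that the directory set as JAVA_HOME really contains a JDK. The following is a minimal sketch; the helper name check_java_home is hypothetical and not part of Hadoop.

```shell
# check_java_home DIR: succeed only if DIR/bin/java exists and is executable.
# (check_java_home is an illustrative helper, not a Hadoop command.)
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "OK: $1/bin/java"
    else
        echo "No executable java under $1/bin" >&2
        return 1
    fi
}
```

With the path used above, the call would be: check_java_home /opt/jdk1.8.0_201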

 

cd /home/spark/bigdata/hadoop-2.7.7

vi etc/hadoop/core-site.xml

<configuration>

    <property>

        <name>fs.defaultFS</name>

        <value>hdfs://master:9000</value>

    </property>

</configuration>

 

vi etc/hadoop/hdfs-site.xml

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

</configuration>
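
To double-check what was just typed into the two files, a small helper can print the value stored for a property name. This is only a sketch: get_prop is a hypothetical name, and it assumes each name and value element sits on its own line, as in the files above. On a working installation, bin/hdfs getconf -confKey fs.defaultFS reports the value Hadoop actually resolved.

```shell
# get_prop FILE KEY: print the <value> that follows <name>KEY</name> in FILE.
# Assumes <name> and <value> are on separate lines (illustrative helper only).
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")      # drop everything up to the opening tag
            sub(/<\/value>.*/, "")    # drop the closing tag and anything after it
            print
            exit
        }
    ' "$1"
}
```

For example: get_prop etc/hadoop/hdfs-site.xml dfs.replication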

 

Set up passphraseless ssh

Now check that you can ssh to master without a passphrase:

  $ ssh master

If you cannot ssh to master without a passphrase, execute the following commands:

  $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

  $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

  $ chmod 0600 ~/.ssh/authorized_keys
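
The key setup can be verified without risking an interactive prompt. The sketch below wraps ssh in a hypothetical helper named can_ssh; BatchMode and ConnectTimeout are standard OpenSSH client options. Note that in batch mode a host whose key is not yet in known_hosts is also rejected, so connect once by hand first.

```shell
# can_ssh HOST: succeed only if a non-interactive login to HOST works.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
# (can_ssh is an illustrative helper name.)
can_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}
```

Usage: can_ssh master && echo "passphraseless ssh works"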

 

Execution

The following instructions run a MapReduce job locally. If you want to execute a job on YARN, see YARN on a Single Node below.

Format the filesystem:

  $ bin/hdfs namenode -format

Start NameNode daemon and DataNode daemon:

  $ sbin/start-dfs.sh

The Hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
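
A quick way to see whether the daemons came up is the JDK's jps tool, which lists running Java processes as "pid MainClass". The sketch below filters that listing; missing_daemons is a hypothetical helper, and the daemon names are passed in by the caller.

```shell
# missing_daemons NAME...: read `jps` output on stdin and print each NAME
# that is not listed. Returns non-zero if any is missing. (Illustrative helper.)
missing_daemons() {
    listing=$(cat)
    rc=0
    for name in "$@"; do
        # Compare against column 2 exactly, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$listing" | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }'; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}
```

After start-dfs.sh on a single node, one would expect this to print nothing: jps | missing_daemons NameNode DataNode SecondaryNameNode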

 

Browse the web interface for the NameNode; by default it is available at:

NameNode - http://master:50070/

Make the HDFS directories required to execute MapReduce jobs:

  $ bin/hdfs dfs -mkdir /user

  $ bin/hdfs dfs -mkdir /user/spark

Copy the input files into the distributed filesystem:

  $ bin/hdfs dfs -put etc/hadoop /user/spark

Run some of the examples provided:

  $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar grep /user/spark/hadoop /user/spark/output 'dfs[a-z.]+'
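
For a sanity check of the job's result, roughly the same counting can be done locally with standard tools on the directory that was uploaded. This is only an approximation of the example job, and local_grep_count is a hypothetical helper name.

```shell
# local_grep_count DIR: count matches of 'dfs[a-z.]+' in the files directly
# under DIR, most frequent first. Subdirectories are skipped (errors silenced).
local_grep_count() {
    grep -ohE 'dfs[a-z.]+' "$1"/* 2>/dev/null | sort | uniq -c | sort -rn
}
```

Run from the Hadoop directory as: local_grep_count etc/hadoop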

 

 

Examine the output files: Copy the output files from the distributed filesystem to the local filesystem and examine them:

  $ bin/hdfs dfs -get /user/spark/output ./tmp/output

  $ cat ./tmp/output/*

or

View the output files on the distributed filesystem:

  $ bin/hdfs dfs -cat /user/spark/output/*

When you’re done, stop the daemons with:

  $ sbin/stop-dfs.sh

 

YARN on a Single Node

You can run a MapReduce job on YARN in a pseudo-distributed mode by setting a few parameters and running ResourceManager daemon and NodeManager daemon in addition.

The following instructions assume that the steps above, from formatting the filesystem through creating the HDFS directories under /user, have already been executed.

  1. Configure the parameters in etc/hadoop/mapred-site.xml and etc/hadoop/yarn-site.xml as follows:

cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml

vi etc/hadoop/mapred-site.xml

<configuration>

    <property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

    </property>

</configuration>

vi etc/hadoop/yarn-site.xml

<configuration>

    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>

</configuration>

 

  2. Start ResourceManager daemon and NodeManager daemon:

  $ sbin/start-yarn.sh

  3. Browse the web interface for the ResourceManager; by default it is available at:

ResourceManager - http://master:8088/

  4. Run a MapReduce job.
  5. When you’re done, stop the daemons with:

  $ sbin/stop-yarn.sh


Fully-Distributed Operation

For information on setting up fully-distributed, non-trivial clusters see Cluster Setup.
