Prerequisites: deploy a single-node Hadoop cluster on a Linux machine. Java and ssh must be installed, and the sshd service must be running; see the separate note on enabling the sshd service.
Steps:
1. Download a Hadoop distribution from the Apache website and unpack it.
2. In the etc/hadoop/hadoop-env.sh file, make the following edit (no spaces around the =; fill in the path to your own JDK):
export JAVA_HOME=/usr/java/latest
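It is worth confirming that the path actually points at a JDK before going further; a minimal sanity check (/usr/java/latest is only an example path, substitute your own):

```shell
# Sanity check: make sure the JAVA_HOME exported in hadoop-env.sh
# points at a real JVM. The path below is an example, not a given.
JAVA_HOME=/usr/java/latest
if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version
else
    echo "JAVA_HOME does not point at a JDK: $JAVA_HOME" >&2
fi
```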
3. Run bin/hadoop in a shell. It prints the usage documentation for the hadoop script:
[hadoop@ip-172-199-0-15 hadoop]$ bin/hadoop
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME
 or
  where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
                       note: please use "yarn jar" to launch
                             YARN applications, not this command.
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  credential           interact with credential providers
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings

Most commands print help when invoked w/o parameters.
4. Edit the Hadoop configuration files:
etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
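If you prefer to script these edits rather than open an editor, the files can be written with a heredoc; a minimal sketch for core-site.xml (HADOOP_DIR is a placeholder for your unpacked distribution directory):

```shell
# Sketch: write etc/hadoop/core-site.xml non-interactively.
# HADOOP_DIR is an example variable; point it at your unpacked
# Hadoop distribution (it falls back to a scratch dir for the demo).
HADOOP_DIR="${HADOOP_DIR:-$(mktemp -d)}"
mkdir -p "$HADOOP_DIR/etc/hadoop"
cat > "$HADOOP_DIR/etc/hadoop/core-site.xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF
```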
5. Set up passwordless ssh login:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
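The chmod step matters: sshd silently ignores an authorized_keys file whose permissions are too loose. A quick way to confirm the mode took effect (a sanity check using GNU stat, not part of the official steps):

```shell
# Print the octal permission bits; authorized_keys should report 600
# after the chmod above. (Uses GNU coreutils `stat`.)
stat -c '%a %n' ~/.ssh/authorized_keys 2>/dev/null || echo "no authorized_keys yet"
```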
To test it:
$ ssh localhost
6. Format the filesystem:
$ bin/hdfs namenode -format
7. Start the HDFS NameNode and DataNode daemons:
$ sbin/start-dfs.sh
8. Create the HDFS directories required to run MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
9. Upload local files (everything under etc/hadoop) into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input
10. Run one of the example jobs bundled with the distribution. This grep job scans the files in input for strings matching the given regular expression and writes the match counts to the output directory on the DFS:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
11. Download the output files from the DFS and view the results locally:
$ bin/hdfs dfs -get output output
$ cat output/*
Or view the results directly on the DFS:
$ bin/hdfs dfs -cat output/*
12. Run the job on YARN. Start the ResourceManager and NodeManager daemons:
$ sbin/start-yarn.sh
13. Open the ResourceManager web UI:
Visit http://localhost:8088/
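Note: for the step 12 job to actually execute on YARN rather than in the local runner, the official single-node guide also sets two more properties, sketched below in the same style as step 4 (verify them against the Apache documentation for your Hadoop version):
etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>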
14. Shut down the daemons:
$ sbin/stop-yarn.sh
$ sbin/stop-dfs.sh
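As a closing aside, what the grep example in step 10 computes can be approximated locally with standard tools; a rough sketch on made-up sample input:

```shell
# Local analogue of the MapReduce grep example: extract every match of
# the regex dfs[a-z.]+ and count occurrences of each distinct match.
dir=$(mktemp -d)
printf 'dfs.replication is one\nno match here\ndfs.replication again\n' > "$dir/sample.txt"
grep -ohE 'dfs[a-z.]+' "$dir"/*.txt | sort | uniq -c | sort -rn
```

The real job emits the same kind of (count, matched string) pairs, just computed as a map-reduce over HDFS files.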