(3) Hadoop Run Modes

啊了个呜

Category: Big Data. Tags: distributed systems, hadoop, big data

Article link: https://blog.csdn.net/weixin_46129834/article/details/105796344

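I. Hadoop has three run modes:

Local (standalone) mode: the default mode. No separate daemons need to be started; jobs run directly against the local file system. Used for development and testing.
Pseudo-distributed mode: configured like a fully distributed cluster, but all daemons run on a single node.
Fully distributed mode: multiple nodes run together as a cluster.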

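II. Local mode examples

Official grep example
1. Create an input folder under the hadoop-2.7.2 folder
```
[starfish@hadoop100 hadoop-2.7.2]$ mkdir input
```
2. Copy Hadoop's XML configuration files into input (they only serve as sample data; you can use files of your own instead)
```
[starfish@hadoop100 hadoop-2.7.2]$ cp etc/hadoop/*.xml input
```
3. Run the MapReduce example program from the share directory
```
[starfish@hadoop100 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
```
  This command runs bin/hadoop and executes the grep program inside share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar. It searches the files in the input folder for strings matching the regular expression 'dfs[a-z.]+' and writes the result to the output folder in the current directory. The output folder is created automatically and must not exist beforehand, otherwise the job fails.
4. View the output
```
[starfish@hadoop100 hadoop-2.7.2]$ cat output/*
```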
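
To get a feel for what the grep job computes, the following sketch reproduces the same kind of count with standard shell tools on a tiny made-up sample. It does not run Hadoop; the temporary directory, the sample file, and the function name `grep_counts` are assumptions of this example. The real job writes each result line as the count followed by a tab and the matched string, most frequent first.

```shell
#!/bin/sh
# Local sketch of what the grep example counts (not the Hadoop job itself).
# The sample directory and file below are hypothetical.
workdir=$(mktemp -d)
mkdir "$workdir/input"
cat > "$workdir/input/sample.xml" <<'EOF'
<name>dfs.replication</name>
<name>dfs.namenode.name.dir</name>
<description>see dfs.replication</description>
EOF

# -o prints only the matching part, -h suppresses file names, -E enables '+'.
# uniq -c counts each distinct match; sort -rn puts the most frequent first;
# awk rewrites each line as "count<TAB>match".
grep_counts() {
    grep -ohE 'dfs[a-z.]+' "$1"/*.xml | sort | uniq -c | sort -rn | awk '{print $1 "\t" $2}'
}

grep_counts "$workdir/input"
```

On this sample, dfs.replication is reported twice and dfs.namenode.name.dir once.
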
Official wordcount example
1. Create a wcinput folder under the hadoop-2.7.2 folder
```
[starfish@hadoop100 hadoop-2.7.2]$ mkdir wcinput
```
2. Create a wc.input file in the wcinput folder
```
[starfish@hadoop100 hadoop-2.7.2]$ cd wcinput
[starfish@hadoop100 wcinput]$ touch wc.input
```
3. Edit wc.input
```
[starfish@hadoop100 wcinput]$ vim wc.input
# Enter the following content in the file:
hadoop yarn
hadoop mapreduce
hello
you
you

# Save and quit with :wq
```
4. Go back to the /opt/module/hadoop-2.7.2 folder and run the program
```
[starfish@hadoop100 hadoop-2.7.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount wcinput wcoutput
```
  This command runs hadoop and executes the wordcount program inside share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar. The data to count is in the wcinput folder, and the result is written to the wcoutput folder in the current directory.
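
Once the job finishes, the counts can be read with `cat wcoutput/part-r-00000`, the default name of the reducer output file. As a rough cross-check that does not need Hadoop, the sketch below counts the same words with standard shell tools. It recreates the wc.input content in a temporary file, and the function name `word_counts` is made up for this example.

```shell
#!/bin/sh
# Local cross-check for the wordcount example; does not run Hadoop.
# The temporary file mirrors the wc.input content typed in step 3.
sample=$(mktemp)
cat > "$sample" <<'EOF'
hadoop yarn
hadoop mapreduce
hello
you
you
EOF

# Split on spaces and tabs into one word per line, drop empty lines,
# then count each word and print "word<TAB>count" sorted by word.
word_counts() {
    tr -s ' \t' '\n' < "$1" | grep -v '^$' | sort | uniq -c | awk '{print $2 "\t" $1}'
}

word_counts "$sample"
```

For this input the sketch reports hadoop 2, hello 1, mapreduce 1, yarn 1, and you 2, which is what the Hadoop job should also report.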

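III. Pseudo-distributed mode examples

Pseudo-distributed operation is covered in two parts: running a MapReduce program on HDFS and running a MapReduce program on YARN.

Running a MapReduce program on HDFS

(1) First make sure Hadoop and the JDK are installed on this virtual machine and that environment variables are configured for both.

(2) Edit the configuration files

In hadoop-env.sh, set the JAVA_HOME path:

export JAVA_HOME=/opt/module/jdk1.7.0_79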

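core-site.xml (add inside the <configuration> element)

<!-- Address of the NameNode in HDFS -->
<property>
	<name>fs.defaultFS</name>
	<value>hdfs://hadoop100:8020</value>
</property>

<!-- Directory where Hadoop stores the files it generates at runtime -->
<property>
	<name>hadoop.tmp.dir</name>
	<value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>

hdfs-site.xml (add inside the <configuration> element)

<!-- Number of HDFS replicas; 1 because there is only one DataNode -->
<property>
	<name>dfs.replication</name>
	<value>1</value>
</property>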

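(3) Format the NameNode

[starfish@hadoop100 hadoop-2.7.2]$ bin/hdfs namenode -format

Note: format only before the first start and do not format again afterwards. Re-formatting gives the NameNode a new cluster ID, and a DataNode that still holds the old ID will then fail to start.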
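
One way to avoid formatting by accident is to check whether NameNode metadata already exists. The sketch below assumes the default layout, in which the NameNode keeps its metadata under ${hadoop.tmp.dir}/dfs/name and a formatted NameNode has a current/VERSION file there. The function name `is_formatted` is made up for this example, and the script only prints advice; it never formats anything itself.

```shell
#!/bin/sh
# Succeeds if the NameNode under the given hadoop.tmp.dir looks formatted.
# Assumes dfs.namenode.name.dir is left at its default: ${hadoop.tmp.dir}/dfs/name.
is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

tmp_dir=/opt/module/hadoop-2.7.2/data/tmp   # hadoop.tmp.dir from core-site.xml

if is_formatted "$tmp_dir"; then
    echo "NameNode already formatted; skipping format"
else
    echo "No NameNode metadata found; run: bin/hdfs namenode -format"
fi
```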

(4) Start the NameNode and the DataNode

# Start the NameNode
[starfish@hadoop100 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
# Check with jps; if NameNode appears in the list, it has started
[starfish@hadoop100 hadoop-2.7.2]$ jps
# Start the DataNode
[starfish@hadoop100 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
[starfish@hadoop100 hadoop-2.7.2]$ jps
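
jps prints one line per JVM, with the process ID followed by the main class name, so the check can be scripted. The helper below is only a sketch: the name `has_daemon` and the sample PIDs are invented for illustration, and in real use the listing would come from running jps.

```shell
#!/bin/sh
# has_daemon NAME JPS_OUTPUT: succeeds if a JVM named exactly NAME is listed.
# Matching the whole second field keeps SecondaryNameNode from counting as NameNode.
has_daemon() {
    printf '%s\n' "$2" | awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Sample listing with made-up PIDs; in real use: listing=$(jps)
listing='3201 NameNode
3305 DataNode
3410 Jps'

for daemon in NameNode DataNode; do
    if has_daemon "$daemon" "$listing"; then
        echo "$daemon is running"
    else
        echo "$daemon is NOT running"
    fi
done
```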

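(5) Browse the HDFS file system in the web UI (files can be added, deleted, and viewed there, but not modified)

Enter this URL in a browser: http://192.168.10.100:50070/dfshealth.html#tab-overview

The HDFS file system is then visible.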

(6) Steps for running a MapReduce program on HDFS

Create an input directory on the HDFS file system

[starfish@hadoop100 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/atguigu/mapreduce/wordcount/input

Upload the local wc.input file to the HDFS file system

[starfish@hadoop100 hadoop-2.7.2]$ bin/hdfs dfs -put wcinput/wc.input /user/atguigu/mapreduce/wordcount/input/

Run the MapReduce program on HDFS

[starfish@hadoop100 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/atguigu/mapreduce/wordcount/input/ /user/atguigu/mapreduce/wordcount/output

View the result
```
[starfish@hadoop100 hadoop-2.7.2]$ bin/hdfs dfs -cat /user/atguigu/mapreduce/wordcount/output/*
```
Alternatively, view the result in the web UI of the HDFS file system (substitute the address of your own NameNode host):
http://192.168.1.101:50070/dfshealth.html#tab-overview

Command to delete the result (-rm -r replaces the deprecated -rmr)

[starfish@hadoop100 hadoop-2.7.2]$ bin/hdfs dfs -rm -r /user/atguigu/mapreduce/wordcount/output
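
Because a MapReduce job refuses to start when its output directory already exists, it can be convenient to delete the old output only when it is actually there. The sketch below wraps that check in a small function; the name `remove_if_exists` is made up for this example, and the function takes the hdfs launcher as its first argument so that it is not tied to one install path. It relies on `hdfs dfs -test -d`, which succeeds when the path is an existing directory.

```shell
#!/bin/sh
# remove_if_exists HDFS_CMD DIR: delete an HDFS directory only if it exists.
# HDFS_CMD is the hdfs launcher to call, for example bin/hdfs.
remove_if_exists() {
    if "$1" dfs -test -d "$2"; then
        "$1" dfs -rm -r "$2"
    else
        echo "nothing to remove at $2"
    fi
}

# Run from the hadoop-2.7.2 folder before re-running the wordcount job.
remove_if_exists bin/hdfs /user/atguigu/mapreduce/wordcount/output
```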

Running a MapReduce program on YARN

To be continued.
