Running Hadoop in Local Mode

Hadoop supports three run modes:

local (standalone) mode, pseudo-distributed mode, and fully distributed mode.
Hadoop official website: http://hadoop.apache.org/


Local Mode
1. Create an input directory under the hadoop-2.7.2 directory

[root@localhost hadoop-2.7.2]# mkdir input
[root@localhost hadoop-2.7.2]# 

2. Copy Hadoop's XML configuration files into input

[root@localhost hadoop-2.7.2]# cp etc/hadoop/*.xml input
[root@localhost hadoop-2.7.2]# 

3. Run the MapReduce example program from the share directory

[root@localhost hadoop-2.7.2]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar  grep input output 'dfs[a-z.]+'


............

		Bytes Read=123
	File Output Format Counters 
		Bytes Written=23
[root@localhost hadoop-2.7.2]# 

4. View the output

[root@localhost hadoop-2.7.2]# cat output/*
1	dfsadmin
[root@localhost hadoop-2.7.2]# 
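What the example job just did: it scanned every file in input and counted each string matching the regular expression dfs[a-z.]+. The same counting can be sketched with plain shell tools (the sample file below is made up for illustration; it is not the real Hadoop XML):

```shell
# Throwaway sample input (illustrative content only)
mkdir -p /tmp/grepdemo
printf '<name>dfs.replication</name>\n<name>fs.defaultFS</name>\n' > /tmp/grepdemo/sample.xml

# Extract every match of the pattern, then count occurrences per match --
# which is what the hadoop-mapreduce-examples "grep" job does at scale.
grep -ohE 'dfs[a-z.]+' /tmp/grepdemo/*.xml | sort | uniq -c | sort -rn
```

Here the pipeline prints a count of 1 for dfs.replication, the same shape as the "1 dfsadmin" line above.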

Official WordCount Example

1. Create a wcinput directory under the hadoop-2.7.2 directory

[root@localhost hadoop-2.7.2]# mkdir wcinput
[root@localhost hadoop-2.7.2]# 

2. Create a wc.input file under wcinput

[root@localhost hadoop-2.7.2]# cd wcinput
[root@localhost wcinput]# touch wc.input
[root@localhost wcinput]# 

3. Edit the wc.input file (careful not to type vim /wc.input: the leading slash would edit a file at the filesystem root and leave this one empty)

[root@localhost wcinput]# vim wc.input

hadoop yarn
hadoop mapreduce
atguigu
atguigu

Save and exit with :wq
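The edit above can also be done non-interactively with a heredoc, which is convenient when scripting the setup. A sketch; /tmp is used here only so the snippet is self-contained (in the tutorial the file lives under hadoop-2.7.2/wcinput):

```shell
# Write the four sample lines into wc.input without opening an editor
mkdir -p /tmp/wcinput
cat > /tmp/wcinput/wc.input <<'EOF'
hadoop yarn
hadoop mapreduce
atguigu
atguigu
EOF

wc -l /tmp/wcinput/wc.input   # reports 4 lines
```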

4. Return to the Hadoop directory /opt/module/hadoop-2.7.2

[root@localhost wcinput]# cd ..
[root@localhost hadoop-2.7.2]# 

5. Run the program

[root@localhost hadoop-2.7.2]# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount wcinput wcountput

............
............

	
Bytes Written=50

[root@localhost hadoop-2.7.2]# 

6. View the results

The first attempt below fails for two reasons: the output directory was actually named wcountput (note the spelling used in step 5), and wc.input turned out to be 0 bytes because the earlier edit had gone to the wrong path. The fix is to refill wc.input, delete the stale output directory, and rerun the job:

[root@localhost hadoop-2.7.2]# cat wcoutput/*
cat: wcoutput/*: No such file or directory
[root@localhost hadoop-2.7.2]# cd wcinput/
[root@localhost wcinput]# ll
total 0
-rw-r--r--. 1 root root 0 Nov 14 23:34 wc.input
[root@localhost wcinput]# cd ..
[root@localhost hadoop-2.7.2]# rm -rf wcountput/
[root@localhost hadoop-2.7.2]# 

After refilling wc.input and rerunning the job, view the results:

[root@localhost hadoop-2.7.2]# cat wcountput/*
atguigu	2
hadoop	2
mapreduce	1
yarn	1
[root@localhost hadoop-2.7.2]# 
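As a sanity check, the same counts can be reproduced with plain shell tools. This is only a local analogue of what the WordCount job computes, not MapReduce itself:

```shell
# Recreate the input locally (same four lines as wc.input)
printf 'hadoop yarn\nhadoop mapreduce\natguigu\natguigu\n' > /tmp/wc.input

# Split lines into words (map), group identical words (shuffle/sort),
# and count each group (reduce)
tr -s ' ' '\n' < /tmp/wc.input | sort | uniq -c | sort -k2
```

The counts match the job output above: atguigu 2, hadoop 2, mapreduce 1, yarn 1.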


Pseudo-Distributed Mode

  1. Overview
    (1) Configure the cluster
    (2) Start the cluster and test create, delete, and query operations
    (3) Run the WordCount example

  2. Steps
    (1) Configure the cluster
    (a) Configure hadoop-env.sh
    First, note that Hadoop's configuration files all live under hadoop-2.7.2/etc/hadoop:

[root@localhost hadoop-2.7.2]# ll
total 28
drwxr-xr-x. 2 root root   194 May 22  2017 bin
drwxr-xr-x. 3 root root    20 May 22  2017 etc
drwxr-xr-x. 2 root root   106 May 22  2017 include
drwxr-xr-x. 2 root root   187 Nov 14 23:13 input
drwxr-xr-x. 3 root root    20 May 22  2017 lib
drwxr-xr-x. 2 root root   239 May 22  2017 libexec
-rw-r--r--. 1 root root 15429 May 22  2017 LICENSE.txt
-rw-r--r--. 1 root root   101 May 22  2017 NOTICE.txt
drwxr-xr-x. 2 root root    88 Nov 14 23:25 output
-rw-r--r--. 1 root root  1366 May 22  2017 README.txt
drwxr-xr-x. 2 root root  4096 May 22  2017 sbin
drwxr-xr-x. 4 root root    31 May 22  2017 share
drwxr-xr-x. 2 root root    22 Nov 15 00:00 wcinput
drwxr-xr-x. 2 root root    88 Nov 15 00:01 wcountput
[root@localhost hadoop-2.7.2]# cd etc/hadoop/
[root@localhost hadoop]# ls
capacity-scheduler.xml  hadoop-env.sh               httpfs-env.sh            kms-env.sh            mapred-env.sh               ssl-server.xml.example
configuration.xsl       hadoop-metrics2.properties  httpfs-log4j.properties  kms-log4j.properties  mapred-queues.xml.template  yarn-env.cmd
container-executor.cfg  hadoop-metrics.properties   httpfs-signature.secret  kms-site.xml          mapred-site.xml.template    yarn-env.sh
core-site.xml           hadoop-policy.xml           httpfs-site.xml          log4j.properties      slaves                      yarn-site.xml
hadoop-env.cmd          hdfs-site.xml               kms-acls.xml             mapred-env.cmd        ssl-client.xml.example
[root@localhost hadoop]# 

The files to configure are hadoop-env.sh, core-site.xml, and hdfs-site.xml. In hadoop-env.sh, set JAVA_HOME to your JDK installation path (the usual prerequisite for the daemons to start).
Start with core-site.xml.
1. Edit core-site.xml with the command: vim core-site.xml

[root@localhost hadoop]# vim core-site.xml

<configuration>

<!-- Address of the NameNode in HDFS -->
<property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop101:9000</value>
</property>

<!-- Base directory for files Hadoop generates at runtime -->
<property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop/module/hadoop-2.7.2/data</value>
</property>
</configuration>

Next, configure hdfs-site.xml:

[root@localhost hadoop]# vim hdfs-site.xml


<configuration>

<!-- Number of HDFS block replicas -->
<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>

<!-- HTTP address of the NameNode web UI -->
<property>
        <name>dfs.namenode.http-address</name>
        <value>slave1:50070</value>
</property>
</configuration>

This configures Hadoop as a single node (replication factor 1). With both files in place, the NameNode must be formatted before the cluster is started.
Note: format only on the very first startup; do not re-format after that.
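Because re-formatting an existing NameNode destroys the cluster's identity (see the cluster-ID note later in this post), it can help to wrap the format step in a small guard. This is a hypothetical helper, not part of Hadoop; DATA_DIR mirrors the hadoop.tmp.dir value configured above, but /tmp is used so the sketch runs anywhere:

```shell
# Hypothetical guard: refuse to format when NameNode metadata already exists.
# On a real install DATA_DIR would be /opt/hadoop/module/hadoop-2.7.2/data.
DATA_DIR=/tmp/hadoop-demo/data
if [ -d "$DATA_DIR/dfs/name/current" ]; then
    echo "refusing to format: $DATA_DIR already holds NameNode metadata"
else
    echo "safe to format: would run bin/hdfs namenode -format"
fi
```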

[root@localhost hadoop-2.7.2]# bin/hdfs namenode -format

19/11/15 09:39:29 INFO namenode.NameNode: STARTUP_MSG: 

Once formatting completes, start the NameNode:

[root@localhost hadoop-2.7.2]# sbin/hadoop-daemon.sh start namenode

starting namenode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-MissZhou-namenode-localhost.localdomain.out
[root@localhost hadoop-2.7.2]# 

After starting it, check whether it came up by running jps:

[root@localhost hadoop-2.7.2]# jps
9526 Jps
[root@localhost hadoop-2.7.2]# 

As shown here, jps lists only Jps, meaning the NameNode did not actually come up; when that happens, check the logs under logs/ and start it again before continuing.

Next, start the DataNode:

[root@localhost hadoop-2.7.2]# sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /opt/hadoop/module/hadoop-2.7.2/logs/hadoop-root-datanode-localhost.localdomain.out
[root@localhost hadoop]# jps
13104 Jps
12711 DataNode
12871 SecondaryNameNode
12587 NameNode
[root@localhost hadoop]# 

Note: jps is a JDK tool, not a Linux command; it is unavailable unless the JDK is installed.

Viewing the HDFS file system in the browser
The NameNode web UI listens on port 50070, as configured in hdfs-site.xml above.

Checking the generated logs
Tip: in production work, when you hit a bug, the log messages are usually the first place to look when analyzing and fixing it.
The logs live under /opt/module/hadoop-2.7.2/logs:

[root@localhost hadoop-2.7.2]# ll
total 32
drwxr-xr-x. 2 root     root   194 May 22  2017 bin
drwxr-xr-x. 4 root     root    28 Nov 15 09:39 data
drwxr-xr-x. 3 root     root    20 May 22  2017 etc
drwxr-xr-x. 2 root     root   106 May 22  2017 include
drwxr-xr-x. 2 root     root   187 Nov 14 23:13 input
drwxr-xr-x. 3 root     root    20 May 22  2017 lib
drwxr-xr-x. 2 root     root   239 May 22  2017 libexec
-rw-r--r--. 1 root     root 15429 May 22  2017 LICENSE.txt
drwxr-xr-x. 2 MissZhou root  4096 Nov 15 10:33 logs
-rw-r--r--. 1 root     root   101 May 22  2017 NOTICE.txt
drwxr-xr-x. 2 root     root    88 Nov 14 23:25 output
-rw-r--r--. 1 root     root  1366 May 22  2017 README.txt
drwxr-xr-x. 2 root     root  4096 May 22  2017 sbin
drwxr-xr-x. 4 root     root    31 May 22  2017 share
drwxr-xr-x. 2 root     root    22 Nov 15 00:00 wcinput
drwxr-xr-x. 2 root     root    88 Nov 15 00:01 wcountput
[root@localhost hadoop-2.7.2]# cd logs
[root@localhost logs]# ls
hadoop-MissZhou-namenode-localhost.localdomain.log    hadoop-root-datanode-localhost.localdomain.out.1         hadoop-root-secondarynamenode-localhost.localdomain.out
hadoop-MissZhou-namenode-localhost.localdomain.out    hadoop-root-namenode-localhost.localdomain.log           hadoop-root-secondarynamenode-localhost.localdomain.out.1
hadoop-MissZhou-namenode-localhost.localdomain.out.1  hadoop-root-namenode-localhost.localdomain.out           SecurityAuth-root.audit
hadoop-root-datanode-localhost.localdomain.log        hadoop-root-namenode-localhost.localdomain.out.1
hadoop-root-datanode-localhost.localdomain.out        hadoop-root-secondarynamenode-localhost.localdomain.log
[root@localhost logs]# 
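When a daemon fails to start, the most recently modified log is usually the one to read. A generic sketch (the two files below are fabricated so the snippet is self-contained):

```shell
# Fabricated stand-ins for real Hadoop log files
mkdir -p /tmp/logsdemo
echo old > /tmp/logsdemo/hadoop-root-datanode.log
sleep 1
echo new > /tmp/logsdemo/hadoop-root-namenode.log

# Print the most recently modified .log file (newest first)
ls -t /tmp/logsdemo/*.log | head -1
```

On a real install, the equivalent is running ls -t logs/*.log | head -1 from the Hadoop directory.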

Question to think about: why must the NameNode not be formatted over and over, and what should you watch out for when formatting it?

[root@localhost logs]# cd ..
[root@localhost hadoop-2.7.2]# cd data/tmp/dfs/name/current/
[root@localhost current]# cat VERSION
#Fri Nov 15 10:06:50 GMT 2019
namespaceID=2076116523
clusterID=CID-8346e188-fa8d-4890-972c-1c3129023816
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1475381544-127.0.0.1-1573812410026
layoutVersion=-63
[root@localhost current]# 

Note: formatting the NameNode generates a new cluster ID. If the DataNode still carries the old one, the NameNode's and DataNode's cluster IDs no longer match and the cluster cannot find its previous data. So before re-formatting the NameNode, always delete the data directory and the logs first, then format.
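The mismatch described above can be checked directly: both the NameNode and the DataNode record a clusterID in their VERSION files, and the two must agree. A self-contained sketch with fabricated VERSION files (on a real install, compare the NameNode's VERSION shown above with the corresponding DataNode VERSION under the data directory):

```shell
# Fabricated VERSION files standing in for the real NameNode/DataNode metadata
mkdir -p /tmp/dfsdemo/name /tmp/dfsdemo/data
echo 'clusterID=CID-aaaa' > /tmp/dfsdemo/name/VERSION
echo 'clusterID=CID-bbbb' > /tmp/dfsdemo/data/VERSION

# Compare the two clusterID lines; a mismatch means the DataNode
# belongs to a previous (pre-format) incarnation of the cluster.
nn=$(grep '^clusterID=' /tmp/dfsdemo/name/VERSION)
dn=$(grep '^clusterID=' /tmp/dfsdemo/data/VERSION)
if [ "$nn" = "$dn" ]; then
    echo "cluster IDs match"
else
    echo "cluster ID mismatch: the DataNode will refuse to join"
fi
```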

If the web UI still cannot be accessed afterwards, see the separate write-up on resolving the access problem.
