Installing and configuring a single-node Hadoop cluster, and running the wordcount example

HADOOP PSEUDO-DISTRIBUTED MODE (single-node cluster setup)

Official Hadoop download link:
https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
It is best to download a stable release. The latest version I installed earlier had some problems; I am not certain they were version-related, but after switching back to the stable release there seemed to be fewer issues.
For a single-node cluster setup, the official tutorial is still the most authoritative:
http://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/SingleCluster.html
A single node means pseudo-distributed mode. From the official docs: Hadoop can run on a single node in so-called pseudo-distributed mode, where each Hadoop daemon runs as a separate Java process.
http://hadoop.apache.org/docs/r1.0.4/cn/quickstart.html#运行Hadoop集群的准备工作
After setup, format the NameNode:
hdfs namenode -format
(The NameNode only needs to be formatted once; do not format it on every start. Formatting repeatedly leaves the NameNode and DataNode with mismatched cluster IDs, and the DataNode will then fail to start. A fix is given at the end of this post.)
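To avoid reformatting by accident on later runs, the format step can be guarded by checking whether the name directory already holds a `VERSION` file. A minimal sketch, assuming the `dfs.namenode.name.dir` path configured later in this post (`maybe_format` is my own helper name):

```shell
#!/bin/sh
# Guard against re-formatting: skip if NameNode metadata already exists.
# NAME_DIR is assumed to match dfs.namenode.name.dir in hdfs-site.xml.
NAME_DIR="${NAME_DIR:-/root/clound/hadoop/data/name}"

maybe_format() {
    if [ -f "$NAME_DIR/current/VERSION" ]; then
        # A formatted NameNode keeps its cluster ID in current/VERSION.
        echo "NameNode already formatted, skipping"
    else
        hdfs namenode -format
    fi
}

# maybe_format   # call this once during initial setup
```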
Start the NameNode and DataNode:
start-dfs.sh
Check the JVM processes with jps:

16098 NameNode
16245 DataNode
16437 SecondaryNameNode
16590 Jps
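The jps output above can also be checked automatically. A small sketch that reads a process list from stdin and confirms the three HDFS daemons are present (`check_hdfs_daemons` is my own helper name):

```shell
# Verify the HDFS daemons are up by parsing `jps`-style output.
# Reads the process list from stdin so it can be fed `jps` directly.
check_hdfs_daemons() {
    procs=$(cat)
    for d in NameNode DataNode SecondaryNameNode; do
        if ! printf '%s\n' "$procs" | grep -qw "$d"; then
            echo "missing: $d"
            return 1
        fi
    done
    echo "all HDFS daemons running"
}

# Usage: jps | check_hdfs_daemons
```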

(At first the DataNode process did not show up, so I edited etc/hadoop/hdfs-site.xml under the Hadoop install directory:

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/clound/hadoop/data/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/clound/hadoop/data/data</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- disable HDFS permission checks -->
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

This adds the NameNode and DataNode directories, which I created manually under the install directory. Note that fs.default.name (the deprecated name for fs.defaultFS, whose value must be a full URI such as hdfs://localhost:9000) conventionally lives in core-site.xml, and mapred.job.tracker in mapred-site.xml.
Reference: https://blog.csdn.net/weixin_35353187/article/details/81779973)
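The directories referenced in hdfs-site.xml must exist before the first format. A sketch of creating them, with the base path as a parameter (`make_hdfs_dirs` is a hypothetical helper; the base path should match the values configured above):

```shell
# Create the NameNode/DataNode directories referenced in hdfs-site.xml.
make_hdfs_dirs() {
    base="$1"
    # -p creates intermediate directories and is a no-op if they exist.
    mkdir -p "$base/name" "$base/data"
}

# make_hdfs_dirs /root/clound/hadoop/data
```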
Start the ResourceManager and NodeManager:
start-yarn.sh
jps
25078 ResourceManager
25271 NodeManager
13352 SecondaryNameNode
26536 Jps
17530 DataNode
13038 NameNode
Now test the bundled wordcount example.
First create a directory of your own on HDFS:
hdfs dfs -mkdir -p /flower/hadoop/input
Write a test file whose words will be counted later:
vim testHadoop.txt
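Instead of editing interactively in vim, the test file can be written non-interactively with a heredoc; the content below matches the file printed via `hadoop fs -cat` at the end of this post:

```shell
# Write the test file without opening an editor.
# The quoted 'EOF' delimiter keeps the content literal (no expansion).
cat > testHadoop.txt <<'EOF'
haha
我就知道我是最聪明的!
我是最棒的!
i am the best!
do you konw?
hadoop
i am comming…
EOF
wc -l testHadoop.txt
```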
Upload it to HDFS:
hadoop fs -put testHadoop.txt /flower/hadoop/input
hadoop fs -ls /flower/hadoop/input

Run the wordcount example that ships with Hadoop:

hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /flower/hadoop/input/testHadoop.txt /flower/hadoop/output

Here wordcount is the name of the example MapReduce program inside the examples jar. Note that the job fails if the output directory already exists, so remove it (hadoop fs -rm -r /flower/hadoop/output) before rerunning.

The output after a successful run:

[root@flower-server hadoop]# hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /flower/hadoop/input/testHadoop.txt /flower/hadoop/output
19/01/22 21:54:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/01/22 21:54:55 INFO input.FileInputFormat: Total input files to process : 1
19/01/22 21:54:56 INFO mapreduce.JobSubmitter: number of splits:1
19/01/22 21:54:57 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
19/01/22 21:54:57 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1548165280849_0001
19/01/22 21:54:58 INFO impl.YarnClientImpl: Submitted application application_1548165280849_0001
19/01/22 21:54:58 INFO mapreduce.Job: The url to track the job: http://izbp1czpl17je74lb8g7gbz:8088/proxy/application_1548165280849_0001/
19/01/22 21:54:58 INFO mapreduce.Job: Running job: job_1548165280849_0001
19/01/22 21:55:12 INFO mapreduce.Job: Job job_1548165280849_0001 running in uber mode : false
19/01/22 21:55:12 INFO mapreduce.Job:  map 0% reduce 0%
19/01/22 21:55:19 INFO mapreduce.Job:  map 100% reduce 0%
19/01/22 21:55:27 INFO mapreduce.Job:  map 100% reduce 100%
19/01/22 21:55:28 INFO mapreduce.Job: Job job_1548165280849_0001 completed successfully
19/01/22 21:55:28 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=184
		FILE: Number of bytes written=397769
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=236
		HDFS: Number of bytes written=130
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=4741
		Total time spent by all reduces in occupied slots (ms)=5016
		Total time spent by all map tasks (ms)=4741
		Total time spent by all reduce tasks (ms)=5016
		Total vcore-milliseconds taken by all map tasks=4741
		Total vcore-milliseconds taken by all reduce tasks=5016
		Total megabyte-milliseconds taken by all map tasks=4854784
		Total megabyte-milliseconds taken by all reduce tasks=5136384
	Map-Reduce Framework
		Map input records=7
		Map output records=14
		Map output bytes=167
		Map output materialized bytes=184
		Input split bytes=125
		Combine input records=14
		Combine output records=12
		Reduce input groups=12
		Reduce shuffle bytes=184
		Reduce input records=12
		Reduce output records=12
		Spilled Records=24
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=250
		CPU time spent (ms)=1300
		Physical memory (bytes) snapshot=366415872
		Virtual memory (bytes) snapshot=4209389568
		Total committed heap usage (bytes)=165810176
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=111
	File Output Format Counters 
		Bytes Written=130

However, the job-tracking URL printed in the log is still not reachable for me; I have not figured out why yet. (One likely cause: the URL uses the machine's internal hostname, so from another machine you would need the server's public IP and port 8088 to be open.)
[root@flower-server hadoop]# hadoop fs -ls /flower/hadoop/output
Found 2 items
-rw-r--r-- 1 root supergroup 0 2019-01-22 21:55 /flower/hadoop/output/_SUCCESS
-rw-r--r-- 1 root supergroup 130 2019-01-22 21:55 /flower/hadoop/output/part-r-00000
[root@flower-server hadoop]# hadoop fs -cat /flower/hadoop/output/part-r-00000
am 2
best! 1
comming… 1
do 1
hadoop 1
haha 1
i 2
konw? 1
the 1
you 1
我就知道我是最聪明的! 1
我是最棒的! 1
[root@flower-server hadoop]# hadoop fs -cat /flower/hadoop/input/testHadoop.txt
haha
我就知道我是最聪明的!
我是最棒的!
i am the best!
do you konw?
hadoop
i am comming…

DataNode fails to start after the NameNode has been formatted multiple times

Empty the hadoop.tmp.dir directory configured in core-site.xml (under etc/hadoop in the Hadoop install directory), also empty the NameNode and DataNode directories configured in hdfs-site.xml, then reformat the NameNode.
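A sketch of that recovery sequence; `clear_dirs` is a hypothetical helper that empties each directory without deleting the directory itself, and the Hadoop commands are left commented since the exact paths depend on your core-site.xml / hdfs-site.xml:

```shell
# Empty (but keep) each directory passed as an argument.
clear_dirs() {
    for d in "$@"; do
        if [ -d "$d" ]; then
            # Remove regular and hidden entries; ignore unmatched globs.
            rm -rf "$d"/* "$d"/.[!.]* 2>/dev/null || true
        fi
    done
    return 0
}

# Recovery sequence for the cluster-ID mismatch:
# stop-dfs.sh
# clear_dirs /tmp/hadoop-root /root/clound/hadoop/data/name /root/clound/hadoop/data/data
# hdfs namenode -format
# start-dfs.sh
```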

Hadoop failed on connection exception: java.net.ConnectException: Connection

Replace localhost in each configuration file with the machine's real IP address; you can look it up with ifconfig.
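A sketch of doing that replacement with sed; `set_real_ip` and the example IP are illustrative, the file list should cover whichever site files mention localhost, and `sed -i` as used here assumes GNU sed:

```shell
# Rewrite localhost to the machine's real IP in the given config files.
set_real_ip() {
    ip="$1"; shift
    for f in "$@"; do
        # In-place substitution of every occurrence of "localhost".
        sed -i "s/localhost/$ip/g" "$f"
    done
}

# set_real_ip 192.168.1.10 etc/hadoop/core-site.xml etc/hadoop/hdfs-site.xml
```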
