Getting Started with Hadoop

Running the Chapter 2 code from Hadoop: The Definitive Guide, 4th Edition

Grab the book's source code (hadoop-book-master) from GitHub, then follow the steps in the project's README to prepare the jar; I won't go into the details here.

export HADOOP_CLASSPATH=hadoop-examples.jar

hadoop MaxTemperature input/ncdc/sample.txt output

The two lines above run a test MapReduce job; run them from the root of the hadoop-book-master directory.
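What the job computes can be sketched locally without a cluster. The same chapter of the book gives a Unix-tools version of the analysis; the sketch below follows that idea, assuming the NCDC fixed-width layout used by the book's mapper (year at columns 16-19, signed temperature in tenths of a degree at columns 88-92, quality code at column 93). The three input records here are synthetic, built just to exercise the logic:

```shell
# Build three synthetic NCDC-style records (fixed-width offsets assumed:
# year at cols 16-19, signed temperature at cols 88-92, quality at col 93).
filler15=$(printf '0%.0s' $(seq 15))
filler68=$(printf '0%.0s' $(seq 68))
{
  printf '%s1949%s+01111\n' "$filler15" "$filler68"   # 1949, +11.1 C, quality 1
  printf '%s1949%s+00781\n' "$filler15" "$filler68"   # 1949,  +7.8 C, quality 1
  printf '%s1950%s-00111\n' "$filler15" "$filler68"   # 1950,  -1.1 C, quality 1
} > sample.txt

# Per-year maximum, mirroring what the MaxTemperature mapper and reducer do:
# drop missing readings (9999) and bad quality codes, keep the max per year.
awk '{
  year = substr($0, 16, 4)
  temp = substr($0, 88, 5) + 0          # "+0111" -> 111 (tenths of a degree)
  q    = substr($0, 93, 1)
  if (temp != 9999 && q ~ /[01459]/ && (!(year in max) || temp > max[year]))
    max[year] = temp
}
END { for (y in max) print y "\t" max[y] }' sample.txt | sort
```

On these synthetic records it prints 1949 with 111 and 1950 with -11, the same tab-separated year/max-temperature shape as the job's part-r-00000 output.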

Before doing this, start the Hadoop services from the hadoop/sbin directory.
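Concretely, that means something like the following (assuming Hadoop's sbin directory is on your PATH; adjust to your install):

```shell
# Start HDFS (NameNode, DataNode, SecondaryNameNode)
start-dfs.sh

# Start YARN (ResourceManager, NodeManager)
start-yarn.sh

# jps should now list the daemons above; a missing DataNode means
# the job will fail later when writing to HDFS
jps
```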

When first running the code above I hit a problem connecting remotely, and then what may have been a YARN issue; neither Baidu nor Google turned up an answer. Regrettably, I no longer know how those were resolved; after formatting HDFS again, the error turned into a different one.

The second-to-last problem was the error There are 0 datanode(s) running and no node(s) are excluded in this operation. It is caused by leftovers from formatting two or more times: the DataNode still carries metadata from the previous format. The fix is to delete the current directory under tmp/hdfs/data, run hadoop namenode -format to reformat the NameNode, and then restart the Hadoop services with start-dfs.sh and start-yarn.sh.
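As a command sequence, the recovery looks roughly like this. The tmp/hdfs/data path is an assumption based on my setup; check dfs.datanode.data.dir (hdfs-site.xml) or hadoop.tmp.dir (core-site.xml) for the actual storage location before deleting anything:

```shell
# Stop the cluster before touching the storage directories
stop-yarn.sh
stop-dfs.sh

# Remove the stale DataNode metadata left by the previous format
# (path assumed; verify against dfs.datanode.data.dir first)
rm -rf tmp/hdfs/data/current

# Reformat the NameNode (hdfs namenode -format is the current form;
# hadoop namenode -format still works but is deprecated)
hdfs namenode -format

# Restart the services
start-dfs.sh
start-yarn.sh
```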

The last problem was that user/saikikky/input/ncdc did not exist in HDFS. The fix: create the input directory with hadoop fs -mkdir -p /user/saikikky/input, then run hadoop fs -put /Users/saikikky/Documents/专业学习/hadoop-book-master/input/ncdc /user/saikikky/input to copy the book's local input/ncdc directory into the folder just created. hadoop fs -ls /user/saikikky confirms that the files are there.

With that, the two commands above run without errors.

The run produced the following output:

Wangsiqi:hadoop-book-master saikikky$ export HADOOP_CLASSPATH=hadoop-examples.jar
Wangsiqi:hadoop-book-master saikikky$ hadoop MaxTemperature input/ncdc/sample.txt output
18/11/14 20:29:46 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
18/11/14 20:29:55 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
18/11/14 20:29:55 INFO input.FileInputFormat: Total input files to process : 1
18/11/14 20:29:55 INFO mapreduce.JobSubmitter: number of splits:1
18/11/14 20:29:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1542196849893_0002
18/11/14 20:29:56 INFO impl.YarnClientImpl: Submitted application application_1542196849893_0002
18/11/14 20:29:56 INFO mapreduce.Job: The url to track the job: http://10.12.137.232:8088/proxy/application_1542196849893_0002/
18/11/14 20:29:56 INFO mapreduce.Job: Running job: job_1542196849893_0002
18/11/14 20:30:03 INFO mapreduce.Job: Job job_1542196849893_0002 running in uber mode : false
18/11/14 20:30:03 INFO mapreduce.Job:  map 0% reduce 0%
18/11/14 20:30:07 INFO mapreduce.Job:  map 100% reduce 0%
18/11/14 20:30:12 INFO mapreduce.Job:  map 100% reduce 100%
18/11/14 20:30:12 INFO mapreduce.Job: Job job_1542196849893_0002 completed successfully
18/11/14 20:30:12 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=61
		FILE: Number of bytes written=315039
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=651
		HDFS: Number of bytes written=17
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2044
		Total time spent by all reduces in occupied slots (ms)=2139
		Total time spent by all map tasks (ms)=2044
		Total time spent by all reduce tasks (ms)=2139
		Total vcore-milliseconds taken by all map tasks=2044
		Total vcore-milliseconds taken by all reduce tasks=2139
		Total megabyte-milliseconds taken by all map tasks=2093056
		Total megabyte-milliseconds taken by all reduce tasks=2190336
	Map-Reduce Framework
		Map input records=5
		Map output records=5
		Map output bytes=45
		Map output materialized bytes=61
		Input split bytes=122
		Combine input records=0
		Combine output records=0
		Reduce input groups=2
		Reduce shuffle bytes=61
		Reduce input records=5
		Reduce output records=2
		Spilled Records=10
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=80
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0
		Total committed heap usage (bytes)=319815680
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=529
	File Output Format Counters 
		Bytes Written=17

Use hadoop fs -cat /user/saikikky/output/part-r-00000 to view the final output; the result is as follows:

Wangsiqi:hadoop-book-master saikikky$ hadoop fs -ls /user/saikikky/output
Found 2 items
-rw-r--r--   1 saikikky supergroup          0 2018-11-14 20:30 /user/saikikky/output/_SUCCESS
-rw-r--r--   1 saikikky supergroup         17 2018-11-14 20:30 /user/saikikky/output/part-r-00000
Wangsiqi:hadoop-book-master saikikky$ hadoop fs -cat /user/saikikky/output/part-r-00000
1949	111
1950	22

The job ran successfully.
