Installing the Hadoop 2.x Eclipse plugin on Win7 and running MapReduce programs on Win7/Linux

艾团长

Article link: https://blog.csdn.net/weixin_34439035/article/details/113015267

I. On Win7

(1) Environment and packages

Win7 32-bit

JDK 7

eclipse-java-juno-SR2-win32.zip

hadoop-2.2.0.tar.gz

hadoop-eclipse-plugin-2.2.0.jar

hadoop-common-2.2.0-bin.rar

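(2) Installation

This assumes that the JDK and Eclipse are already installed and that Hadoop is already configured in pseudo-distributed mode.

1. Copy the hadoop-eclipse-plugin-2.2.0.jar plugin into the plugins subdirectory of the Eclipse installation directory, then restart Eclipse.

2. Set the environment variables.

3. Configure the Hadoop installation directory in Eclipse.

Extract hadoop-2.2.0.tar.gz.

4. Extract hadoop-common-2.2.0-bin.rar.

Copy the files inside it into the bin folder of the Hadoop installation directory.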
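
On Windows, Hadoop 2.x generally expects native helper files under the bin folder of the Hadoop installation directory, which is presumably why step 4 copies them there. The following is a minimal sketch of a sanity check, not part of the original steps. It assumes that the archive provides winutils.exe and hadoop.dll and that HADOOP_HOME is the environment variable set in step 2; both are assumptions, and the class name is hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class HadoopHomeCheck {
    // File names assumed to come from hadoop-common-2.2.0-bin (an assumption).
    static final String[] EXPECTED = {"winutils.exe", "hadoop.dll"};

    // Returns the expected files that are missing from <hadoopHome>/bin.
    static List<String> missingFiles(File hadoopHome) {
        List<String> missing = new ArrayList<String>();
        File bin = new File(hadoopHome, "bin");
        for (String name : EXPECTED) {
            if (!new File(bin, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // HADOOP_HOME is assumed to be the variable configured in step 2.
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        System.out.println("Missing from bin: " + missingFiles(new File(home)));
    }
}
```

An empty list suggests that the copy in step 4 landed in the right folder.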

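(3) Running MapReduce on YARN from Win7

Create a new project.

Click Window -> Show View -> Map/Reduce Locations.

Click New Hadoop Location...

Add the following configuration and click Finish.

From this point on, you can browse the contents of HDFS.

Write the MapReduce program.

Add a log4j.properties file under the src directory with the following content: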

log4j.rootLogger=debug,appender1

log4j.appender.appender1=org.apache.log4j.ConsoleAppender

log4j.appender.appender1.layout=org.apache.log4j.TTCCLayout

Run it; the result is as follows:

II. On Linux

(1) MapReduce on YARN on Linux

Run

[root@liguodong Documents]# yarn jar test.jar hdfs://liguodong:8020/hello hdfs://liguodong:8020/output

15/05/03 03:16:12 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

………………

15/05/03 03:16:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1430648117067_0001

15/05/03 03:16:13 INFO impl.YarnClientImpl: Submitted application application_1430648117067_0001 to ResourceManager at /0.0.0.0:8032

15/05/03 03:16:13 INFO mapreduce.Job: The url to track the job: http://liguodong:8088/proxy/application_1430648117067_0001/

15/05/03 03:16:13 INFO mapreduce.Job: Running job: job_1430648117067_0001

15/05/03 03:16:21 INFO mapreduce.Job: Job job_1430648117067_0001 running in uber mode : false

15/05/03 03:16:21 INFO mapreduce.Job: map 0% reduce 0%

15/05/03 03:16:40 INFO mapreduce.Job: map 100% reduce 0%

15/05/03 03:16:45 INFO mapreduce.Job: map 100% reduce 100%

15/05/03 03:16:45 INFO mapreduce.Job: Job job_1430648117067_0001 completed successfully

15/05/03 03:16:45 INFO mapreduce.Job: Counters: 43

File System Counters

FILE: Number of bytes read=98

FILE: Number of bytes written=157289

FILE: Number of read operations=0

FILE: Number of large read operations=0

FILE: Number of write operations=0

HDFS: Number of bytes read=124

HDFS: Number of bytes written=28

HDFS: Number of read operations=6

HDFS: Number of large read operations=0

HDFS: Number of write operations=2

Job Counters

Launched map tasks=1

Launched reduce tasks=1

Data-local map tasks=1

Total time spent by all maps in occupied slots (ms)=16924

Total time spent by all reduces in occupied slots (ms)=3683

Map-Reduce Framework

Map input records=3

Map output records=6

Map output bytes=80

Map output materialized bytes=98

Input split bytes=92

Combine input records=0

Combine output records=0

Reduce input groups=4

Reduce shuffle bytes=98

Reduce input records=6

Reduce output records=4

Spilled Records=12

Shuffled Maps =1

Failed Shuffles=0

Merged Map outputs=1

GC time elapsed (ms)=112

CPU time spent (ms)=12010

Physical memory (bytes) snapshot=211070976

Virtual memory (bytes) snapshot=777789440

Total committed heap usage (bytes)=130879488

Shuffle Errors

BAD_ID=0

CONNECTION=0

IO_ERROR=0

WRONG_LENGTH=0

WRONG_MAP=0

WRONG_REDUCE=0

File Input Format Counters

Bytes Read=32

File Output Format Counters

Bytes Written=28

View the result

[root@liguodong Documents]# hdfs dfs -ls /

Found 3 items

-rw-r--r-- 2 root supergroup 32 2015-05-03 03:15 /hello

drwxr-xr-x - root supergroup 0 2015-05-03 03:16 /output

drwx------ - root supergroup 0 2015-05-03 03:16 /tmp

[root@liguodong Documents]# hdfs dfs -ls /output

Found 2 items

-rw-r--r-- 2 root supergroup 0 2015-05-03 03:16 /output/_SUCCESS

-rw-r--r-- 2 root supergroup 28 2015-05-03 03:16 /output/part-r-00000

[root@liguodong Documents]# hdfs dfs -text /output/pa*

hadoop 1

hello 3

me 1

you 1
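
The counters and output above are consistent with a word-count job. The following is a minimal sketch in plain Java (no Hadoop classes) of the map, shuffle, and reduce phases. The three input lines are a hypothetical reconstruction chosen to match the reported counters (3 map input records, 6 map output records, 4 reduce input groups); they are not the known contents of /hello, and the class name is hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Map phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new SimpleEntry<String, Integer>(word, 1));
                }
            }
        }
        return pairs;
    }

    // Shuffle and reduce: group the pairs by key in sorted order and sum the values.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<String, Integer> pair : pairs) {
            Integer old = counts.get(pair.getKey());
            counts.put(pair.getKey(), old == null ? pair.getValue() : old + pair.getValue());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input lines, not the actual contents of /hello.
        List<String> lines = Arrays.asList("hello you", "hello me", "hello hadoop");
        for (Map.Entry<String, Integer> e : reduce(map(lines)).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

With this input the map phase emits 6 pairs and the reduce phase produces 4 keys, in the same sorted order as part-r-00000.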

Problems encountered

File /output/……… could only be replicated to 0 nodes instead of minReplication (=1).

There are 1 datanode(s) running and no node(s) are excluded in this operation.

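I tried many of the fixes suggested online without success, so I went back to the message itself: the file could only be replicated to 0 nodes instead of the minimum of 1 replica.

I first set dfs.replication.min (named dfs.namenode.replication.min in Hadoop 2.x) to 0, but the next run showed that the value must be greater than 0, so I changed it back to 1.

I then listed several paths in dfs.datanode.data.dir. I thought of this as keeping multiple backups on one system, although the extra directories act as additional storage volumes rather than additional replicas. After that, the job succeeded.

The setting is as follows; add it to hdfs-site.xml.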

<property>

<name>dfs.datanode.data.dir</name>

<value>file://${hadoop.tmp.dir}/dfs/dn,file://${hadoop.tmp.dir}/dfs/dn1,file://${hadoop.tmp.dir}/dfs/dn2</value>

</property>

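(2) MapReduce in local mode on Linux

Add the following configuration to mapred-site.xml.

<property>

<name>mapreduce.framework.name</name>

<value>local</value>

</property>

In this mode, the ResourceManager and NodeManager do not need to be started.

Run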

[root@liguodong Documents]# hadoop jar test.jar hdfs://liguodong:8020/hello hdfs://liguodong:8020/output

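III. MapReduce run modes

MapReduce can run in more than one mode; the mode is selected in mapred-site.xml.

1) Local mode (the default)

<property>

<name>mapreduce.framework.name</name>

<value>local</value>

</property>

2) Running on YARN

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>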
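
To make the default concrete, the following sketch reads mapreduce.framework.name from a configuration document and falls back to local when the property is absent. It uses only the JDK's XML parser and is an illustration, not Hadoop's own Configuration class; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class FrameworkNameSketch {
    // Returns the value of mapreduce.framework.name, or "local" when it is absent.
    static String frameworkName(String siteXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(siteXml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (name.equals("mapreduce.framework.name")) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return "local"; // the default run mode described above
    }

    public static void main(String[] args) throws Exception {
        String xml = "<configuration><property>"
                + "<name>mapreduce.framework.name</name><value>yarn</value>"
                + "</property></configuration>";
        System.out.println(frameworkName(xml));                               // yarn
        System.out.println(frameworkName("<configuration></configuration>")); // local
    }
}
```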

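IV. Uber Mode

Uber mode is an optimization in Hadoop 2.x for small MapReduce jobs (a form of JVM reuse, in which the job's tasks run in the ApplicationMaster's JVM).

A small job here means a MapReduce job whose input data is smaller than the HDFS block size (128 MB).

It is disabled by default.

To enable it, add the following to mapred-site.xml:

<property>

<name>mapreduce.job.ubertask.enable</name>

<value>true</value>

</property>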
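
Besides the enable flag, Hadoop 2.x also bounds what counts as small through mapreduce.job.ubertask.maxmaps (default 9), mapreduce.job.ubertask.maxreduces (default 1), and mapreduce.job.ubertask.maxbytes (which defaults to the block size). The following is a simplified sketch of that decision. The class and method names are hypothetical, the 128 MB limit is assumed from the block size mentioned above, and the real check also takes memory and CPU limits into account.

```java
class UberDecisionSketch {
    static final int MAX_MAPS = 9;                    // mapreduce.job.ubertask.maxmaps default
    static final int MAX_REDUCES = 1;                 // mapreduce.job.ubertask.maxreduces default
    static final long MAX_BYTES = 128L * 1024 * 1024; // assumed block size (128 MB)

    // Simplified eligibility test: the flag must be on and the job must be small.
    static boolean isUber(boolean enabled, int maps, int reduces, long inputBytes) {
        return enabled
                && maps <= MAX_MAPS
                && reduces <= MAX_REDUCES
                && inputBytes <= MAX_BYTES;
    }

    public static void main(String[] args) {
        // The job in the YARN log above: 1 map, 1 reduce, 32 bytes of input.
        System.out.println(isUber(false, 1, 1, 32L)); // false
        System.out.println(isUber(true, 1, 1, 32L));  // true
    }
}
```

This matches the log line "running in uber mode : false" from the YARN run, where the flag was still at its default.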
