First, an overview of the setup:
1. Hadoop runs in pseudo-distributed mode, version 2.5.0
2. Hadoop is installed on a Linux machine
3. Eclipse runs on Windows 8 and connects to the Hadoop instance on Linux through the MapReduce plugin
Pseudo-distributed Hadoop configuration files (note: these live on the Linux machine; on Windows you can either copy them over to overwrite the local ones, or leave them unconfigured)
1. core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/data/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.1.12.195:9000</value>
  </property>
</configuration>
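These are only two properties, but a typo here is a common reason the Eclipse plugin cannot connect. As a quick sanity check, here is a minimal Python sketch (the XML string below is just the core-site.xml shown above, inlined; parse_site_xml is a hypothetical helper, not part of Hadoop) that extracts the property values:

```python
import xml.etree.ElementTree as ET

# core-site.xml content as shown above (on a real setup, read the file instead)
CORE_SITE = """<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/data/tmp</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.1.12.195:9000</value>
  </property>
</configuration>"""

def parse_site_xml(xml_text):
    """Return {property name: value} from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

props = parse_site_xml(CORE_SITE)
print(props["fs.defaultFS"])  # the HDFS URI the Eclipse plugin must point at
```

The fs.defaultFS value printed here is exactly what goes into the plugin's DFS Master host/port fields.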
2. hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/data/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/data/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
3. mapred-site.xml ("dns" in the values below stands for the Hadoop machine's host name)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>dns:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>dns:19888</value>
  </property>
</configuration>
4. yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>dns:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>dns:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>dns:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>dns:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>dns:8088</value>
  </property>
</configuration>
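The "dns" host in the values above is a placeholder for the real host name, and it is easy to mix up the port numbers when editing by hand. Below is a hedged Python sketch (the port table is simply copied from the yarn-site.xml above; check_ports is a hypothetical helper) that flags any address whose port does not match:

```python
# Expected endpoints taken from the yarn-site.xml above;
# "dns" is a placeholder for the actual Hadoop host name.
EXPECTED_PORTS = {
    "yarn.resourcemanager.address": 8032,
    "yarn.resourcemanager.scheduler.address": 8030,
    "yarn.resourcemanager.resource-tracker.address": 8031,
    "yarn.resourcemanager.admin.address": 8033,
    "yarn.resourcemanager.webapp.address": 8088,
}

def check_ports(props):
    """Return the property names whose host:port value is missing or
    whose port does not match EXPECTED_PORTS."""
    bad = []
    for name, port in EXPECTED_PORTS.items():
        value = props.get(name, "")
        host, _, p = value.rpartition(":")
        if not host or not p.isdigit() or int(p) != port:
            bad.append(name)
    return bad

sample = {name: "dns:%d" % port for name, port in EXPECTED_PORTS.items()}
print(check_ports(sample))  # [] -> all endpoints use the expected ports
```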
Plugin installation (compiling the plugin was covered earlier)
Configuration on Windows
1. Set JAVA_HOME
2. Set HADOOP_HOME (this must be set, otherwise MapReduce jobs cannot run)
3. Add the bin directory of each to PATH, e.g. %JAVA_HOME%\bin;%HADOOP_HOME%\bin;
4. Download https://github.com/srccodes/hadoop-common-2.2.0-bin from GitHub, extract it, and replace the bin directory of your Windows Hadoop installation with the extracted bin directory
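This replacement step is the one that most often goes wrong: if winutils.exe is missing from %HADOOP_HOME%\bin, jobs on Windows typically fail complaining that winutils.exe cannot be located. A small Python sketch (has_winutils is a hypothetical helper, not part of Hadoop) to verify the replacement took effect:

```python
import os

def has_winutils(hadoop_home):
    """True if winutils.exe is present in <hadoop_home>/bin,
    which is what the replaced bin directory should provide."""
    return os.path.isfile(os.path.join(hadoop_home, "bin", "winutils.exe"))

# Typical use on Windows:
# print(has_winutils(os.environ["HADOOP_HOME"]))
```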
5. Add a log4j.properties file to your project to make debugging easier:
log4j.rootLogger=debug,stdout,R
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p - %m%n
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=mapreduce_test.log
log4j.appender.R.MaxFileSize=1MB
log4j.appender.R.MaxBackupIndex=1
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%p %t %c - %m%n
log4j.logger.com.codefutures=DEBUG
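The rootLogger line must only reference appenders (here stdout and R) that are defined further down in the file. A small Python sketch (undefined_appenders is a hypothetical helper; only the relevant definition lines from the file above are inlined) that catches a dangling appender name:

```python
# The appender-definition lines from the log4j.properties above
LOG4J = """\
log4j.rootLogger=debug,stdout,R
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.R=org.apache.log4j.RollingFileAppender
"""

def undefined_appenders(props_text):
    """Appender names referenced on log4j.rootLogger but never defined."""
    lines = [l.strip() for l in props_text.splitlines() if "=" in l]
    pairs = dict(l.split("=", 1) for l in lines)
    # first item on rootLogger is the level; the rest are appender names
    referenced = [a.strip() for a in pairs["log4j.rootLogger"].split(",")[1:]]
    defined = {k.split(".")[2] for k in pairs if k.startswith("log4j.appender.")}
    return [a for a in referenced if a not in defined]

print(undefined_appenders(LOG4J))  # [] -> stdout and R are both defined
```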
6. Once all of the above is done, you are set.