Building and Testing the Hadoop Eclipse Plugin



Platform:

OS: CentOS release 6.4 (Final) x64

Hadoop 1.1.3


Description:

Build the Hadoop Eclipse plugin with the ant tool.


Required steps:


1. Install and configure ant

2. Configure the Hadoop build files, namely:

$HADOOP_HOME/build.xml

$HADOOP_HOME/src/contrib/build-contrib.xml

$HADOOP_HOME/src/contrib/eclipse-plugin/build.xml

$HADOOP_HOME/src/contrib/eclipse-plugin/build.properties

$HADOOP_HOME/src/contrib/eclipse-plugin/META-INF/MANIFEST.MF

3. Run the ant compile command

4. Load the plugin in eclipse

5. Test




I. Install ant


Download ant:

http://ant.apache.org/bindownload.cgi


apache-ant-1.9.2-bin.tar.gz


Extract it to a directory of your choice:

tar -zxvf apache-ant-1.9.2-bin.tar.gz -C /usr/ant/


Edit the environment variables:

export ANT_HOME=/usr/ant/apache-ant-1.9.2

Append $ANT_HOME/bin to PATH:

export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$ANT_HOME/bin


Verify:

[root@master ~]# ant -version

Apache Ant(TM) version 1.9.2 compiled on July 8 2013

[root@master ~]#
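To make these variables survive new login shells, they can be appended to the shell profile. A minimal sketch, writing to a scratch file rather than the real ~/.bash_profile (the paths are the ones used above; the scratch file name is an assumption for illustration):

```shell
# Sketch: persist ANT_HOME and PATH. In real use, append these two
# lines to ~/.bash_profile (or /etc/profile); a scratch file is used
# here so the demo touches nothing important.
profile=/tmp/bash_profile.demo
cat >> "$profile" <<'EOF'
export ANT_HOME=/usr/ant/apache-ant-1.9.2
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$ANT_HOME/bin
EOF
# Both appended lines mention ANT_HOME:
grep -c 'ANT_HOME' "$profile"
```

After editing the real profile, run `source ~/.bash_profile` and re-check with `ant -version`.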


II. Configure the files needed for the Hadoop build

1. Edit $HADOOP_HOME/build.xml


- Modify line 31 by deleting the -SNAPSHOT suffix (the part shown in red in the original), so that it reads:

31 <property name="version" value="1.1.3"/>


- Comment out this part:

2421 <!--
2422 <target name="ivy-download" description="To download ivy" unless="offline">
2423   <get src="${ivy_repo_url}" dest="${ivy.jar}" usetimestamp="true"/>
2424 </target>
2425 -->

If you do not comment it out, ivy will be downloaded again at every build step.


- Remove the dependency on ivy-download, i.e. delete ivy-download, from the depends attribute (the part shown in red in the original), leaving:

2427 <!--
2428 To avoid Ivy leaking things across big projects, always load Ivy in the same classloader.
2429 Also note how we skip loading Ivy if it is already there, just to make sure all is well.
2430 -->
2431 <target name="ivy-init-antlib" depends="ivy-init-dirs,ivy-probe-antlib" unless="ivy.found">
2432   <typedef uri="antlib:org.apache.ivy.ant" onerror="fail"
2433       loaderRef="ivyLoader">
2434     <classpath>



2. build-contrib.xml

File location:

$HADOOP_HOME/src/contrib/build-contrib.xml


Add the eclipse home directory and the hadoop version. The hadoop version here must match the one in $HADOOP_HOME/build.xml:


<property name="eclipse.home" location="/usr/eclipse"/>
<property name="version" value="1.1.3"/>


Place them at the position shown below (in red in the original), and set location to your own eclipse installation directory:


<!-- Imported by contrib/*/build.xml files to share generic targets. -->

<project name="hadoopbuildcontrib" xmlns:ivy="antlib:org.apache.ivy.ant">

  <property name="eclipse.home" location="/usr/eclipse"/>
  <property name="version" value="1.1.3"/>



Note:

The "version" value here must be identical to the "version" value in $HADOOP_HOME/build.xml.

If they differ, the build fails with an error like:

[hadoop@master hadoop]$ ant compile

Buildfile: /usr/hadoop/build.xml


BUILD FAILED

Target "ivy-probe-antlib" does not exist in the project "Hadoop". It is used from target "ivy-init-antlib".
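A quick way to catch the mismatch before building is to extract the version property from both files and compare. A sketch, demonstrated on two one-line sample files (to run it for real, point it at $HADOOP_HOME/build.xml and $HADOOP_HOME/src/contrib/build-contrib.xml instead):

```shell
# Sketch: compare the "version" property of the two build files.
# Sample stand-in files are created here so the check is self-contained.
dir=/tmp/version-check
mkdir -p "$dir"
printf '<property name="version" value="1.1.3"/>\n' > "$dir/build.xml"
printf '<property name="version" value="1.1.3"/>\n' > "$dir/build-contrib.xml"

# Pull the first version property out of a file.
ver() { sed -n 's/.*name="version" value="\([^"]*\)".*/\1/p' "$1" | head -n1; }

v1=$(ver "$dir/build.xml")
v2=$(ver "$dir/build-contrib.xml")

if [ "$v1" = "$v2" ]; then
  echo "versions match: $v1"
else
  echo "version mismatch: $v1 vs $v2" >&2
fi
```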


3. Edit build.xml


File location:

$HADOOP_HOME/src/contrib/eclipse-plugin/


Insert the lines shown in red in the original (the color is lost here; the additions are the extra <copy> lines for the dependency jars, plus lib/ in the fileset includes):


<!-- Override jar target to specify manifest -->
<target name="jar" depends="compile" unless="skip.contrib">
  <mkdir dir="${build.dir}/lib"/>
  <copy file="${hadoop.root}/build/hadoop-core-${version}.jar" tofile="${build.dir}/lib/hadoop-core.jar" verbose="true"/>
  <copy file="${hadoop.root}/build/ivy/lib/Hadoop/common/commons-cli-${commons-cli.version}.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.root}/lib/commons-configuration-1.6.jar" tofile="${build.dir}/lib/commons-configuration-1.6.jar" verbose="true"/>
  <copy file="${hadoop.root}/lib/commons-httpclient-3.0.1.jar" tofile="${build.dir}/lib/commons-httpclient-3.0.1.jar" verbose="true"/>
  <copy file="${hadoop.root}/lib/commons-lang-2.4.jar" tofile="${build.dir}/lib/commons-lang-2.4.jar" verbose="true"/>
  <copy file="${hadoop.root}/lib/jackson-core-asl-1.8.8.jar" tofile="${build.dir}/lib/jackson-core-asl-1.8.8.jar" verbose="true"/>
  <copy file="${hadoop.root}/lib/jackson-mapper-asl-1.8.8.jar" tofile="${build.dir}/lib/jackson-mapper-asl-1.8.8.jar" verbose="true"/>
  <jar
    jarfile="${build.dir}/hadoop-${name}-${version}.jar"
    manifest="${root}/META-INF/MANIFEST.MF">
    <fileset dir="${build.dir}" includes="classes/ lib/"/>
    <fileset dir="${root}" includes="resources/ plugin.xml"/>
  </jar>
</target>

Note that there is a space between copy and file; if the space is missing, the build fails.


4. Edit build.properties


File location:

$HADOOP_HOME/src/contrib/eclipse-plugin/


Insert the lines shown in red in the original:

[hadoop@master eclipse-plugin]$ cat build.properties
output.. = bin/
bin.includes = META-INF/,\
               plugin.xml,\
               resources/,\
               classes/,\
               lib/

eclipse.home=/usr/eclipse


Here eclipse.home is the eclipse installation directory ($eclipse_home).


5. Edit MANIFEST.MF


File location:

$HADOOP_HOME/src/contrib/eclipse-plugin/META-INF/MANIFEST.MF

[hadoop@master META-INF]$ vi MANIFEST.MF
Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: MapReduce Tools for Eclipse
Bundle-SymbolicName: org.apache.hadoop.eclipse;singleton:=true
Bundle-Version: 0.18
Bundle-Activator: org.apache.hadoop.eclipse.Activator
Bundle-Localization: plugin
Require-Bundle: org.eclipse.ui,
 org.eclipse.core.runtime,
 org.eclipse.jdt.launching,
 org.eclipse.debug.core,
 org.eclipse.jdt,
 org.eclipse.jdt.core,
 org.eclipse.core.resources,
 org.eclipse.ui.ide,
 org.eclipse.jdt.ui,
 org.eclipse.debug.ui,
 org.eclipse.jdt.debug.ui,
 org.eclipse.core.expressions,
 org.eclipse.ui.cheatsheets,
 org.eclipse.ui.console,
 org.eclipse.ui.navigator,
 org.eclipse.core.filesystem,
 org.apache.commons.logging
Eclipse-LazyStart: true
Bundle-ClassPath: classes/,lib/hadoop-core.jar,lib/commons-configuration-1.6.jar,lib/commons-httpclient-3.0.1.jar,lib/commons-lang-2.4.jar,lib/jackson-core-asl-1.8.8.jar,lib/jackson-mapper-asl-1.8.8.jar,lib/commons-cli-1.2.jar
Bundle-Vendor: Apache Hadoop
[hadoop@master META-INF]$


Pay attention to the format and to the ASCII comma ",": the lib entries in Bundle-ClassPath must be written with no spaces or line breaks between them. If the format is wrong, the build still succeeds, but after the plugin is loaded in eclipse, clicking New Hadoop Location does nothing. This problem cost me two days.
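Since a bad Bundle-ClassPath fails silently, a small pre-build check helps: confirm the attribute value contains no spaces. A sketch, run here against a shortened sample manifest (for the real file, change the path to the MANIFEST.MF above and expect the full jar list):

```shell
# Sketch: per the note above, a space or line break inside the
# Bundle-ClassPath value breaks the plugin at runtime even though the
# build succeeds. Checked here on a shortened sample manifest.
mkdir -p /tmp/mf-check
mf=/tmp/mf-check/MANIFEST.MF
printf 'Bundle-ClassPath: classes/,lib/hadoop-core.jar,lib/commons-cli-1.2.jar\n' > "$mf"

line=$(grep '^Bundle-ClassPath:' "$mf")
value=${line#Bundle-ClassPath: }     # strip the header name
case "$value" in
  *' '*) echo "BAD: spaces inside Bundle-ClassPath" ;;
  *)     echo "OK: no spaces inside Bundle-ClassPath" ;;
esac
```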


III. Build and generate the plugin


1. Enter the hadoop root directory


[hadoop@master ~]$ cd $HADOOP_HOME

[hadoop@master hadoop]$


Run ant to build hadoop:


[hadoop@master hadoop]$ ant

Buildfile: /usr/hadoop/build.xml


clover.setup:


clover.info:

[echo]

[echo] Clover not found. Code coverage reports disabled.

[echo]


clover:


ivy-init-dirs:


ivy-probe-antlib:


ivy-init-antlib:


ivy-init:

[ivy:configure] :: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ ::

[ivy:configure] :: loading settings :: file = /usr/hadoop/ivy/ivysettings.xml


ivy-resolve-common:

[ivy:resolve] :: resolving dependencies :: org.apache.hadoop#Hadoop;working@master

[ivy:resolve] confs: [common]

[ivy:resolve] found commons-logging#commons-logging;1.0.4 in maven2

[ivy:resolve] found log4j#log4j;1.2.15 in maven2

[ivy:resolve] found commons-httpclient#commons-httpclient;3.0.1 in maven2

[ivy:resolve] found commons-codec#commons-codec;1.4 in maven2

[ivy:resolve] found commons-cli#commons-cli;1.2 in maven2

[ivy:resolve] found commons-io#commons-io;2.1 in maven2


………………


compile:


BUILD SUCCESSFUL

Total time: 1 minute 38 seconds

[hadoop@master hadoop]$


There will be warnings during the build; they do not matter. When it finishes, a new build folder appears under $HADOOP_HOME.


2. Enter the $HADOOP_HOME/src/contrib/eclipse-plugin/ directory


Run the ant jar command.


Note: this step generates the hadoop plugin jar that eclipse needs.


[hadoop@master eclipse-plugin]$ ant jar

Buildfile: /usr/hadoop/src/contrib/eclipse-plugin/build.xml


check-contrib:


init:

[echo] contrib: eclipse-plugin


init-contrib:



………………

[copy] Copying /usr/hadoop/lib/jackson-mapper-asl-1.8.8.jar to /usr/hadoop/build/contrib/eclipse-plugin/lib/jackson-mapper-asl-1.8.8.jar

[jar] Building jar: /usr/hadoop/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.1.3.jar


BUILD SUCCESSFUL

Total time: 10 seconds


After a successful build, the output (the [jar] line above, shown in red in the original) gives the directory where the plugin jar was written.


At this step you can also check the Hadoop version and build date:


[hadoop@master eclipse-plugin]$ hadoop version

Hadoop 1.1.3

Subversion -r

Compiled by hadoop on Thu Oct 31 13:36:59 CST 2013

From source with checksum c720ddcf4b926991de7467d253a79b8b

[hadoop@master eclipse-plugin]$


IV. Configure eclipse


1. Copy the successfully built plugin into the eclipse plugins directory:

cp $HADOOP_HOME/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.1.3.jar /usr/eclipse/plugins/


2. Open (or restart) eclipse and configure the Hadoop installation directory.


Open Window --> Preferences; you will find a Hadoop Map/Reduce entry, where you need to set the Hadoop installation directory. It is the hadoop home directory, $HADOOP_HOME. Close the dialog when done.

(screenshot: 171856105.png)



3. Open the Map/Reduce perspective and configure Map/Reduce Locations.

Window -> Open Perspective -> Other; in the dialog list that pops up you will see a blue elephant icon labeled Map/Reduce. Select it and click OK.


(screenshot: 171950537.png)

A Map/Reduce Locations view appears at the bottom of Eclipse. Right-click the blank area and choose New Hadoop location.


(screenshot: 172021620.png)


An edit dialog opens; enter the Map/Reduce Master and DFS Master parameters, i.e. the values from the core-site.xml and mapred-site.xml files in hadoop's $HADOOP_HOME/conf directory.


[hadoop@master conf]$ cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <!-- file system properties -->

  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.10.243:9000</value>
  </property>

</configuration>




[hadoop@master conf]$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

  <property>
    <name>mapred.job.tracker</name>
    <value>http://192.168.10.243:9001</value>
  </property>

</configuration>

[hadoop@master conf]$


(screenshot: 172138689.png)


After the configuration is complete you will see something like the following:


(screenshot: 172220533.png)


V. Test that the plugin works in eclipse


Use WordCount.java for the test.


1. Test preparation


Create a folder input, and in it create two files, test1.txt and test2.txt.


Content of test1.txt:

hello word


Content of test2.txt:

hello hadoop


Copy the input folder to the in directory on hdfs:

hadoop dfs -put /home/hadoop/input in


List the in directory:


[hadoop@master ~]$ hadoop dfs -ls ./in/*

-rw-r--r--   3 hadoop supergroup   11 2013-10-31 14:39 /user/hadoop/in/test1.txt

-rw-r--r--   3 hadoop supergroup   13 2013-10-31 14:39 /user/hadoop/in/test2.txt
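The local part of the preparation can be sketched as follows (/tmp/input stands in for /home/hadoop/input; the upload itself needs the running cluster):

```shell
# Sketch: create the input folder and the two test files listed above.
mkdir -p /tmp/input
printf 'hello word\n'   > /tmp/input/test1.txt
printf 'hello hadoop\n' > /tmp/input/test2.txt

# The byte counts match the 11- and 13-byte entries in the dfs listing.
wc -c < /tmp/input/test1.txt | tr -d ' '
wc -c < /tmp/input/test2.txt | tr -d ' '

# The upload step itself, as in the text:
#   hadoop dfs -put /tmp/input in
```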


2. Create the project


File --> New --> Other --> Map/Reduce Project; name the project test (any name works).

Copy WordCount.java into the src directory of the new test project:

cp $HADOOP_HOME/src/examples/org/apache/hadoop/examples/WordCount.java /home/hadoop/workspace/test/src


3. Compile and run WordCount


Right-click WordCount.java --> Run As --> Run Configurations --> Java Application --> WordCount(1) --> Arguments, and enter:

hdfs://192.168.10.243:9000/user/hadoop/in hdfs://192.168.10.243:9000/user/hadoop/out

then Apply --> Run.

(screenshot: 172316296.png)


Note: hdfs://192.168.10.243:9000/user/hadoop/out is the directory where the WordCount output is stored.


View the results:

Method 1:

After a successful run, DFS Locations shows an extra out directory under the hadoop tree, containing two files; the one whose name starts with part- holds the WordCount results. Click it to see them:


(screenshot: 172353236.png)

Method 2:

Use the hadoop command: hadoop dfs -cat ./out/*


Output:

[hadoop@master ~]$ hadoop dfs -cat ./out/*

hadoop 1

hello 2

word 1

[hadoop@master ~]$
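As a sanity check against the job output above, the same counts can be reproduced locally with plain shell tools (this is not the MapReduce job, just an equivalent computation over the two input files; the /tmp paths are assumptions):

```shell
# Sketch: recompute the word counts locally with tr/awk/sort.
printf 'hello word\n'   > /tmp/wc-test1.txt
printf 'hello hadoop\n' > /tmp/wc-test2.txt

cat /tmp/wc-test1.txt /tmp/wc-test2.txt \
  | tr ' ' '\n' \
  | awk 'NF { count[$0]++ } END { for (w in count) print w, count[w] }' \
  | sort
# prints:
#   hadoop 1
#   hello 2
#   word 1
```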


At this point the hadoop development plugin for eclipse has been installed successfully.



Problems encountered along the way:


For example:

1. Errors around line 467 of hadoop's build-contrib.xml, or other encoding errors: these are configuration-file problems; watch out for stray spaces and other characters when editing.

2. Right-clicking does not offer New Hadoop Location: caused by formatting problems in MANIFEST.MF inside the jar, as described above.

3. When a build fails, BUILD FAILED is shown, immediately followed by the reason; fix the problem according to that message.

4. Hadoop versions differing across nodes after the rebuild: my eclipse is installed on the master node, and after a restart none of the datanodes came up. Copy the rebuilt version to every node and then run hadoop namenode -format again, but this loses the data (mine was only test data, so it did not matter). Alternatively, run hadoop datanode -upgrade on each node.