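The Hadoop source downloaded from the official site is not an Eclipse project, so it has to be built again to generate the Eclipse project files.
Local system: Ubuntu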
1. Install the tools
Use sudo apt-get install to install ant, libtool, autoconf, automake, and openssh-server in turn.
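Before moving on, it can help to confirm that the tools actually ended up on PATH. The following is a minimal sketch, not part of the original steps: the helper name missing_tools is hypothetical, and the command names checked (libtoolize for the libtool package, sshd for openssh-server) are assumptions about what those packages install.

```shell
# Print each command from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed command names: libtoolize (libtool package), sshd (openssh-server).
missing=$(missing_tools ant libtoolize autoconf automake sshd)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names join on one line.
  echo "not found on PATH:" $missing
else
  echo "all build tools found"
fi
```

Anything reported as missing can then be installed with sudo apt-get install as described above.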
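2. Build
Open a terminal, go to the directory where Hadoop was extracted, then run ant clean followed by ant eclipse -verbose. A successful build looks like the output shown in the figure.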
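The two Ant invocations can be wrapped in a small shell function so that the build stops at the first failure. This is only a sketch: the function name build_eclipse_project and the build.xml sanity check are assumptions added for illustration, and -verbose is Ant's standard option for verbose output.

```shell
# Run "ant clean" and then "ant eclipse -verbose" inside the Hadoop source tree.
# $1: directory where the Hadoop source was extracted.
build_eclipse_project() {
  src_dir=$1
  if [ ! -f "$src_dir/build.xml" ]; then
    echo "no build.xml in '$src_dir'; is this the Hadoop source directory?" >&2
    return 1
  fi
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$src_dir" && ant clean && ant eclipse -verbose )
}

# Example call (the path is a placeholder, not taken from the article):
# build_eclipse_project "$HOME/hadoop-src"
```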
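3. Import Hadoop
Choose Import and select Existing Project. If the error unbound class path variable ANT_HOME=... appears, define the ANT_HOME classpath variable in the build path settings (Window -> Preferences -> Java -> Build Path -> Classpath Variables) and point it at the Ant installation directory. If com.sun.tools..cannot be resolved appears after that, add jdk/lib/tools.jar to the project's build path. At this point there should be no remaining errors, as shown in the figure.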
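To find the tools.jar that has to be added to the build path, a small helper like the one below can be used. It is a sketch under assumptions: the name find_tools_jar is hypothetical, and it only looks in lib/ under the given JDK directory (falling back to JAVA_HOME). Note that tools.jar ships with JDK 8 and earlier only.

```shell
# Print the path of tools.jar under a JDK directory, or fail if it is absent.
# $1: JDK installation directory (defaults to $JAVA_HOME when omitted).
find_tools_jar() {
  jdk_home=${1:-${JAVA_HOME:-}}
  if [ -n "$jdk_home" ] && [ -f "$jdk_home/lib/tools.jar" ]; then
    printf '%s\n' "$jdk_home/lib/tools.jar"
  else
    echo "tools.jar not found under '$jdk_home'" >&2
    return 1
  fi
}

# Example call:
# find_tools_jar "$JAVA_HOME"
```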
4. Install Hadoop
1. Configure these four files in the conf directory: hadoop-env.sh, core-site.xml, mapred-site.xml, and hdfs-site.xml
hadoop-env.sh:
export JAVA_HOME=/path/to/jdk
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>path of the directory to use for storage</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
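If the site files are created from a script rather than edited by hand, a helper along these lines could generate them. This is a hedged sketch: write_site_xml is a hypothetical function, it does no XML escaping of the values, and the /tmp/hadoop-tmp path in the usage comment is a placeholder rather than a value from the article.

```shell
# Write a Hadoop *-site.xml file from name/value pairs.
# $1: output file; remaining arguments: property name, value, name, value, ...
write_site_xml() {
  out_file=$1
  shift
  {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    while [ "$#" -ge 2 ]; do
      printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
      shift 2
    done
    echo '</configuration>'
  } > "$out_file"
}

# Example calls matching the settings above (run from the Hadoop directory):
# write_site_xml conf/core-site.xml fs.default.name hdfs://localhost:9000 hadoop.tmp.dir /tmp/hadoop-tmp
# write_site_xml conf/mapred-site.xml mapred.job.tracker localhost:9001
# write_site_xml conf/hdfs-site.xml dfs.replication 1
```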
2. Set up passwordless login by running these commands in order: ssh-keygen -t rsa, then cd ~/.ssh/, then cat id_rsa.pub >> authorized_keys
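The same key setup can be scripted so that it is safe to run more than once. The sketch below rests on a few assumptions not stated in the article: the function name setup_passwordless_ssh is hypothetical, the key is created with an empty passphrase (-N ''), and the directory and file permissions are tightened because sshd commonly refuses keys otherwise.

```shell
# Create an RSA key pair if needed and authorize it for logins to this machine.
# $1: ssh directory (defaults to ~/.ssh).
setup_passwordless_ssh() {
  ssh_dir=${1:-$HOME/.ssh}
  key=$ssh_dir/id_rsa
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
  # Generate a key pair with an empty passphrase only if none exists yet.
  [ -f "$key" ] || ssh-keygen -q -t rsa -N '' -f "$key" || return 1
  # Append the public key once; skip it if the exact line is already present.
  grep -qxF "$(cat "$key.pub")" "$ssh_dir/authorized_keys" 2>/dev/null ||
    cat "$key.pub" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}

# Example call, followed by a manual check that no password is requested:
# setup_passwordless_ssh
# ssh localhost
```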
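3. Format the namenode: hadoop namenode -format (the hadoop launcher script is in hadoop/bin, so either add hadoop/bin to the PATH environment variable or cd into hadoop/bin and run it from there)
4. Add the following to hadoop-env.sh in the conf directory (only one of these lines may be enabled per debugging session):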
#debug hadoop
#HADOOP_NAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y"
#HADOOP_SECONDARYNAMENODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8789,server=y,suspend=y"
#HADOOP_DATANODE_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8790,server=y,suspend=y"
#HADOOP_BALANCER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8791,server=y,suspend=y"
#HADOOP_JOBTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8792,server=y,suspend=y"
#HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=8793,server=y,suspend=y"
#debug hadoop
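Each of the lines above differs only in the port number, so the option string can be built by a helper. This is a sketch: the function name jdwp_opts is hypothetical, while the option fields themselves are the ones used in the lines above.

```shell
# Build the JDWP debug options for a given port.
# $1: port the JVM should listen on; $2: suspend flag, y or n (default y).
jdwp_opts() {
  port=$1
  suspend=${2:-y}
  # transport=dt_socket : the debugger connects over a TCP socket
  # address=<port>      : port the JVM listens on
  # server=y            : the JVM waits for the debugger to attach
  # suspend=y           : the JVM pauses at startup until a debugger attaches
  printf '%s\n' "-Xdebug -Xrunjdwp:transport=dt_socket,address=$port,server=y,suspend=$suspend"
}

# Example for hadoop-env.sh, equivalent to the JobTracker line above:
# HADOOP_JOBTRACKER_OPTS="$(jdwp_opts 8792)"
```

With suspend=y the daemon does not proceed until a debugger attaches, which fits the instruction to enable only one of these lines at a time.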
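Taking the jobtracker as an example:
1. Remove the # in front of the HADOOP_JOBTRACKER_OPTS line added in the previous step to enable it. Open a terminal and run start-all.sh to start Hadoop. One of the output lines will be: Listening for transport dt_socket at address: 8792, which shows that the jobtracker is now listening for a debugger.
2. Set a breakpoint. Find the jobtracker code and set a breakpoint on one of its lines, for example on this.clock = clock; in JobTracker.java under org.apache.hadoop.mapred. Then choose Run -> Debug, using a Remote Java Application debug configuration that connects to localhost on port 8792. The program enters debug mode, as shown in the figure below: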