Remote debugging Hadoop in IntelliJ IDEA

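This article walks through the steps for remotely debugging Hadoop from IntelliJ IDEA: configuring SSH, writing the scripts, adjusting pom.xml, preparing the HDFS files, and checking the execution result.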

Please credit the source when reposting. Original address: http://blog.csdn.net/lastsweetop/article/details/8964520

1. Preface

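Android Studio blew me away at the Google I/O 2013 developer conference. I had not expected IntelliJ IDEA to have become so powerful. I have always been a loyal Eclipse fan, but IntelliJ IDEA won me over, so I downloaded it, installed it, and tried it out. It really is impressive, but it has no Hadoop plugin, which was a little disappointing. Since I have been studying Hadoop recently, I decided to set up remote debugging myself. All of the code is hosted on GitHub.
The project is managed with Maven. If you are not very familiar with Maven, have a look at my column.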

The project directory is as follows:
[figure: project directory tree]

2. Step 1: Configure SSH

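These settings are covered all over the web, so I will only describe them briefly here.
On the development machine, run:
ssh-keygen -t rsa
This generates the public key file ~/.ssh/id_rsa.pub.
Copy this file to the namenode node with scp (if namenode already has its own ~/.ssh/id_rsa.pub, copy to a different file name so that it is not overwritten):
scp ~/.ssh/id_rsa.pub hadoop@namenode:~/.ssh/
Log in to namenode:
ssh hadoop@namenode
Append the development machine's id_rsa.pub to authorized_keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Passwordless SSH login is now set up.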
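
The append step above adds a duplicate line every time it is repeated. As a minimal sketch, the key can be appended only when it is not already present, and the file permissions can be tightened in the same pass. The function name `add_authorized_key` and its two arguments are hypothetical and not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: append the key in a public key file to an
# authorized_keys file, but only if that exact line is not there yet.
add_authorized_key() {
    pubkey_file=$1    # e.g. ~/.ssh/id_rsa.pub copied from the dev machine
    auth_file=$2      # e.g. ~/.ssh/authorized_keys on namenode

    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"

    # -f: read patterns from the key file, -F: fixed strings,
    # -x: match whole lines only, -q: no output, exit status only
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi

    # sshd usually rejects key files that other users could write to
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Usage on namenode:
#   add_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Where `ssh-copy-id` is installed on the development machine, `ssh-copy-id hadoop@namenode` performs the copy and the append in one step.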

3. Step 2: Write the scripts

The deploy.sh script:
#!/bin/sh
echo "deploy jar"
scp ../target/styhadoop-ch2-1.0.0-SNAPSHOT.jar hadoop@namenode:~/test/
echo "deploy run.sh"
scp run.sh hadoop@namenode:~/test/
echo "change authority"
ssh hadoop@namenode "chmod 755 ~/test/run.sh"
echo "start run.sh"
ssh hadoop@namenode "~/test/run.sh"
The run.sh script:
#!/bin/sh
echo "add jar to classpath"
export HADOOP_CLASSPATH=~/test/styhadoop-ch2-1.0.0-SNAPSHOT.jar
echo "run hadoop task"
~/hadoop/bin/hadoop com.sweetop.styhadoop.MaxTemperature   input/  output/
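
deploy.sh as written keeps going when a step fails, so a failed scp still leads to an attempt to run whatever jar was already on namenode. The following is a sketch of a fail-fast variant, assuming the same host, directory and jar name as above. The `SCP` and `SSH` variables are hypothetical hooks that allow a dry run without touching the cluster.

```shell
#!/bin/sh
# Sketch of a fail-fast deploy. The defaults mirror the values in deploy.sh;
# SCP and SSH are hypothetical override hooks (e.g. SCP=echo for a dry run).
REMOTE=${REMOTE:-hadoop@namenode}
REMOTE_DIR=${REMOTE_DIR:-'~/test'}   # quoted so "~" is expanded remotely
JAR=${JAR:-../target/styhadoop-ch2-1.0.0-SNAPSHOT.jar}
SCP=${SCP:-scp}
SSH=${SSH:-ssh}

deploy() {
    echo "deploy jar"
    $SCP "$JAR" "$REMOTE:$REMOTE_DIR/" || return 1
    echo "deploy run.sh"
    $SCP run.sh "$REMOTE:$REMOTE_DIR/" || return 1
    echo "change authority"
    $SSH "$REMOTE" "chmod 755 $REMOTE_DIR/run.sh" || return 1
    echo "start run.sh"
    # The exit status of the remote run.sh becomes the status of deploy
    $SSH "$REMOTE" "$REMOTE_DIR/run.sh"
}

# Usage: call deploy at the end of the script, or preview it first:
#   SCP=echo; SSH=echo; deploy
```

Each step returns early on a non-zero exit status, so the echoed messages stop at the step that failed.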

4. Step 3: Configure pom.xml

Use maven-antrun-plugin to execute the script, binding it to the verify phase of the Maven lifecycle:
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-antrun-plugin</artifactId>
            <version>1.7</version>
            <executions>
                <execution>
                    <id>hadoop remote run</id>
                    <phase>verify</phase>
                    <goals>
                        <goal>run</goal>
                    </goals>
                    <configuration>
                        <target name="test">
                            <exec dir="${basedir}/shell" executable="bash">
                                <arg value="deploy.sh"/>
                            </exec>
                        </target>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
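
Because the execution is bound to the verify phase, any Maven build that reaches that phase runs deploy.sh after the jar has been packaged. A minimal sketch of the invocation from the project root follows. The function name `run_remote_job` and the `MVN` variable are hypothetical, added so that a Maven wrapper or a dry run can be substituted.

```shell
#!/bin/sh
# MVN is a hypothetical hook; it defaults to the mvn found on PATH.
MVN=${MVN:-mvn}

run_remote_job() {
    # verify comes after package in the default lifecycle, so the jar in
    # target/ is rebuilt before the antrun execution calls shell/deploy.sh.
    # Extra arguments are passed through to Maven unchanged.
    $MVN clean verify "$@"
}

# Usage:
#   run_remote_job
```

Note that Ant's `<exec>` ignores the script's exit status unless `failonerror="true"` is set, so with the configuration above the Maven build reports success even if deploy.sh fails.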

5. Prepare the HDFS files

[hadoop@namenode test]$hadoop fs -mkdir /user
[hadoop@namenode test]$hadoop fs -mkdir /user/hadoop/
[hadoop@namenode test]$hadoop fs -put input /user/hadoop/
[hadoop@namenode test]$hadoop fs -lsr /user/hadoop
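
The job in run.sh uses the relative paths input/ and output/, which HDFS resolves against the user's home directory, /user/hadoop for the hadoop user. That is why the input data goes under /user/hadoop. The preparation steps can be sketched as a small function that skips the mkdir when the directory already exists. The name `prepare_input` and the `HADOOP` variable are hypothetical; the `fs` subcommands are the Hadoop 1.x ones used above plus `-test -d`.

```shell
#!/bin/sh
# HADOOP is a hypothetical hook; it defaults to the hadoop launcher on PATH.
HADOOP=${HADOOP:-hadoop}

prepare_input() {
    local_dir=$1      # local directory holding the input files, e.g. input
    hdfs_parent=$2    # HDFS directory to upload into, e.g. /user/hadoop

    # "fs -test -d" exits 0 when the path exists and is a directory
    if ! $HADOOP fs -test -d "$hdfs_parent"; then
        $HADOOP fs -mkdir "$hdfs_parent" || return 1
    fi

    # Upload the whole local directory, then list the result recursively
    $HADOOP fs -put "$local_dir" "$hdfs_parent/" || return 1
    $HADOOP fs -lsr "$hdfs_parent"
}

# Usage on namenode, from ~/test:
#   prepare_input input /user/hadoop
```

On Hadoop 2 and later, `-mkdir` needs `-p` to create missing parent directories, and `-lsr` is deprecated in favour of `-ls -R`.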

6. Execution result

test:
     [exec] deploy jar
     [exec] deploy run.sh
     [exec] change authority
     [exec] start run.sh
     [exec] add jar to classpath
     [exec] run hadoop task
     [exec] 13/05/23 11:36:28 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
     [exec] 13/05/23 11:36:28 INFO input.FileInputFormat: Total input paths to process : 2
     [exec] 13/05/23 11:36:28 INFO util.NativeCodeLoader: Loaded the native-hadoop library
     [exec] 13/05/23 11:36:28 WARN snappy.LoadSnappy: Snappy native library not loaded
     [exec] 13/05/23 11:36:29 INFO mapred.JobClient: Running job: job_201305032210_0003
     [exec] 13/05/23 11:36:30 INFO mapred.JobClient:  map 0% reduce 0%
     [exec] 13/05/23 11:36:46 INFO mapred.JobClient:  map 100% reduce 0%
     [exec] 13/05/23 11:37:04 INFO mapred.JobClient:  map 100% reduce 100%
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient: Job complete: job_201305032210_0003
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient: Counters: 29
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   Job Counters 
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Launched reduce tasks=1
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=19771
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Launched map tasks=2
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Data-local map tasks=2
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=13494
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   File Output Format Counters 
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Bytes Written=8
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   FileSystemCounters
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     FILE_BYTES_READ=131296
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     HDFS_BYTES_READ=1777394
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=327106
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=8
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   File Input Format Counters 
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Bytes Read=1777168
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:   Map-Reduce Framework
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map output materialized bytes=131302
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map input records=13130
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce shuffle bytes=65656
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Spilled Records=26258
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map output bytes=105032
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     CPU time spent (ms)=6030
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Total committed heap usage (bytes)=379518976
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Combine input records=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     SPLIT_RAW_BYTES=226
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce input records=13129
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce input groups=1
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Combine output records=0
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Physical memory (bytes) snapshot=469196800
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Reduce output records=1
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1723944960
     [exec] 13/05/23 11:37:09 INFO mapred.JobClient:     Map output records=13129
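
When only one or two numbers from this output matter, the counter lines can be filtered rather than read by eye. The sketch below assumes the log format shown above, where each counter line ends in `name=number`. The function name `counter_value` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read JobClient log output on stdin and print the
# numeric value of the counter whose name is given as the first argument.
counter_value() {
    name=$1
    # Keep only lines of the form "... JobClient:   <name>=<digits>"
    # and print the digits; every other line is dropped.
    sed -n "s/.*JobClient: *$name=\([0-9][0-9]*\)\$/\1/p"
}

# Usage, with the build output saved to a file first:
#   mvn verify > build.log
#   counter_value 'Reduce output records' < build.log
#
# The result itself is stored in HDFS and can be printed on namenode with:
#   hadoop fs -cat 'output/part-*'
```

MapReduce refuses to start a job whose output directory already exists, so a second run needs `hadoop fs -rmr output` first (Hadoop 1.x syntax).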


