A finished jar normally has to be uploaded to the cluster and run with the appropriate commands, which is still difficult for users unfamiliar with Linux. The Hadoop Eclipse plugin solves this nicely: it lets you submit jobs to the Hadoop cluster directly from your local machine (Windows), debug code, inspect errors and results, and manage HDFS files through a graphical interface.
(1) Obtaining and installing the Eclipse plugin
I am using hadoop-2.5.1, so the matching plugin version is hadoop-eclipse-plugin-2.5.1.jar; other versions can be downloaded online. Installation is simple: copy the jar into the eclipse\plugins directory and restart Eclipse.
(2) Configuring the Eclipse plugin
Create a new location under Map/Reduce Locations:
Fill in the information as follows:
Set the host to your own master's IP address; the ports are normally 9001 (Map/Reduce master) and 9000 (DFS master).
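The DFS Master port must match what the cluster itself uses, which on the master comes from the Hadoop configuration files. A typical core-site.xml entry looks like the fragment below (the hostname master is an example; substitute your own):

```xml
<!-- etc/hadoop/core-site.xml: the DFS Master port in the plugin
     dialog must match the port in fs.defaultFS -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```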
Then click Finish. If you see an error like the following:
An internal error occurred during: "Map/Reduce location status updater".
java.lang.NullPointerException
Possible fixes:
1. Check that the configured parameters are correct.
2. Ignore the error and use the DFS Location directly.
Go to the Hadoop root directory and start HDFS:
[root@master Desktop]# cd /home/hadoop/hadoop-2.5.1/
[root@master hadoop-2.5.1]# ./sbin/start-dfs.sh
(3) Managing HDFS with the Eclipse plugin
(4) Submitting a job to Hadoop from Eclipse
Right-click CountJob.java and choose Run As -> Run Configurations. Create a new configuration named CountJob under Java Application on the left, and fill in the input and output paths under Arguments.
Then go to /home/hadoop/hadoop-2.5.1/etc/hadoop and copy the three files core-site.xml, hdfs-site.xml, and log4j.properties (select all three and press Ctrl-C), then right-click ipaddress_browser_count->src in Eclipse's left panel and choose Paste (or select it and press Ctrl-V). The result looks like this:
Finally, right-click CountJob.java and choose Run As -> Run on Hadoop. On success the console prints a stream of output, the same as when submitting the jar from the command line; when you see map 100% reduce 100%, the job has completed successfully.
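Note the WARN about "Hadoop command-line option parsing not performed" in the log below: it appears because the driver does not implement the Tool interface. A minimal driver sketch that would address it is shown here; the class name CountJobDriver and job name are assumptions for illustration, not the project's actual CountJob code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class CountJobDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // args[0] and args[1] are the input/output paths from the
        // Run Configuration's Arguments field
        Job job = Job.getInstance(getConf(), "ip-browser-count");
        job.setJarByClass(CountJobDriver.class);
        // set mapper, reducer, and output key/value classes here ...
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips generic Hadoop options (-D, -conf, ...)
        // before passing the remaining args to run()
        System.exit(ToolRunner.run(new Configuration(), new CountJobDriver(), args));
    }
}
```

Running the job this way also silences the deprecation chatter caused by overriding configuration on the client side, since generic options are parsed once by ToolRunner.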
18/04/26 22:01:31 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
18/04/26 22:01:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
18/04/26 22:01:32 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
18/04/26 22:01:32 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).
18/04/26 22:01:32 INFO input.FileInputFormat: Total input paths to process : 1
18/04/26 22:01:32 INFO mapreduce.JobSubmitter: number of splits:2
18/04/26 22:01:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local616959303_0001
18/04/26 22:01:33 WARN conf.Configuration: file:/home/hadoop/mydata/mapred/staging/root616959303/.staging/job_local616959303_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
18/04/26 22:01:33 WARN conf.Configuration: file:/home/hadoop/mydata/mapred/staging/root616959303/.staging/job_local616959303_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
18/04/26 22:01:33 WARN conf.Configuration: file:/home/hadoop/mydata/mapred/local/localRunner/root/job_local616959303_0001/job_local616959303_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
18/04/26 22:01:33 WARN conf.Configuration: file:/home/hadoop/mydata/mapred/local/localRunner/root/job_local616959303_0001/job_local616959303_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
18/04/26 22:01:33 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
18/04/26 22:01:33 INFO mapreduce.Job: Running job: job_local616959303_0001
18/04/26 22:01:33 INFO mapred.LocalJobRunner: OutputCommitter set in config null
18/04/26 22:01:33 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
18/04/26 22:01:33 INFO mapred.LocalJobRunner: Waiting for map tasks
18/04/26 22:01:33 INFO mapred.LocalJobRunner: Starting task: attempt_local616959303_0001_m_000000_0
18/04/26 22:01:33 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
18/04/26 22:01:33 INFO mapred.MapTask: Processing split: hdfs://master:9000/user/hadoop/input/ssl.www.shuatibei.com_access.log:0+134217728
18/04/26 22:01:33 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
18/04/26 22:01:34 INFO mapreduce.Job: Job job_local616959303_0001 running in uber mode : false
18/04/26 22:01:34 INFO mapreduce.Job: map 0% reduce 0%
18/04/26 22:01:34 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
18/04/26 22:01:34 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
18/04/26 22:01:34 INFO mapred.MapTask: soft limit at 83886080
18/04/26 22:01:34 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
18/04/26 22:01:34 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
18/04/26 22:01:39 INFO mapred.LocalJobRunner:
18/04/26 22:01:39 INFO mapred.MapTask: Starting flush of map output
18/04/26 22:01:39 INFO mapred.MapTask: Spilling map output
18/04/26 22:01:39 INFO mapred.MapTask: bufstart = 0; bufend = 14931004; bufvoid = 104857600
18/04/26 22:01:39 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 22084680(88338720); length = 4129717/6553600
18/04/26 22:01:40 INFO mapred.MapTask: Finished spill 0
18/04/26 22:01:40 INFO mapred.Task: Task:attempt_local616959303_0001_m_000000_0 is done. And is in the process of committing
18/04/26 22:01:40 INFO mapred.LocalJobRunner: map
18/04/26 22:01:40 INFO mapred.Task: Task 'attempt_local616959303_0001_m_000000_0' done.
18/04/26 22:01:40 INFO mapred.LocalJobRunner: Finishing task: attempt_local616959303_0001_m_000000_0
18/04/26 22:01:40 INFO mapred.LocalJobRunner: Starting task: attempt_local616959303_0001_m_000001_0
18/04/26 22:01:40 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
18/04/26 22:01:40 INFO mapred.MapTask: Processing split: hdfs://master:9000/user/hadoop/input/ssl.www.shuatibei.com_access.log:134217728+32687387
18/04/26 22:01:40 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
18/04/26 22:01:40 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
18/04/26 22:01:40 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
18/04/26 22:01:40 INFO mapred.MapTask: soft limit at 83886080
18/04/26 22:01:40 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
18/04/26 22:01:40 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
18/04/26 22:01:41 INFO mapreduce.Job: map 100% reduce 0%
18/04/26 22:01:41 INFO mapred.LocalJobRunner:
18/04/26 22:01:41 INFO mapred.MapTask: Starting flush of map output
18/04/26 22:01:41 INFO mapred.MapTask: Spilling map output
18/04/26 22:01:41 INFO mapred.MapTask: bufstart = 0; bufend = 3492714; bufvoid = 104857600
18/04/26 22:01:41 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 25235344(100941376); length = 979053/6553600
18/04/26 22:01:42 INFO mapred.MapTask: Finished spill 0
18/04/26 22:01:42 INFO mapred.Task: Task:attempt_local616959303_0001_m_000001_0 is done. And is in the process of committing
18/04/26 22:01:42 INFO mapred.LocalJobRunner: map
18/04/26 22:01:42 INFO mapred.Task: Task 'attempt_local616959303_0001_m_000001_0' done.
18/04/26 22:01:42 INFO mapred.LocalJobRunner: Finishing task: attempt_local616959303_0001_m_000001_0
18/04/26 22:01:42 INFO mapred.LocalJobRunner: map task executor complete.
18/04/26 22:01:42 INFO mapred.LocalJobRunner: Waiting for reduce tasks
18/04/26 22:01:42 INFO mapred.LocalJobRunner: Starting task: attempt_local616959303_0001_r_000000_0
18/04/26 22:01:42 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
18/04/26 22:01:42 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5bef7247
18/04/26 22:01:42 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=174555136, maxSingleShuffleLimit=43638784, mergeThreshold=115206392, ioSortFactor=10, memToMemMergeOutputsThreshold=10
18/04/26 22:01:42 INFO reduce.EventFetcher: attempt_local616959303_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
18/04/26 22:01:42 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local616959303_0001_m_000001_0 decomp: 3982244 len: 3982248 to MEMORY
18/04/26 22:01:42 INFO reduce.InMemoryMapOutput: Read 3982244 bytes from map-output for attempt_local616959303_0001_m_000001_0
18/04/26 22:01:42 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 3982244, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->3982244
18/04/26 22:01:42 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local616959303_0001_m_000000_0 decomp: 16995866 len: 16995870 to MEMORY
18/04/26 22:01:43 INFO reduce.InMemoryMapOutput: Read 16995866 bytes from map-output for attempt_local616959303_0001_m_000000_0
18/04/26 22:01:43 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 16995866, inMemoryMapOutputs.size() -> 2, commitMemory -> 3982244, usedMemory ->20978110
18/04/26 22:01:43 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
18/04/26 22:01:43 INFO mapred.LocalJobRunner: 2 / 2 copied.
18/04/26 22:01:43 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
18/04/26 22:01:43 INFO mapred.Merger: Merging 2 sorted segments
18/04/26 22:01:43 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 20978080 bytes
18/04/26 22:01:44 INFO reduce.MergeManagerImpl: Merged 2 segments, 20978110 bytes to disk to satisfy reduce memory limit
18/04/26 22:01:44 INFO reduce.MergeManagerImpl: Merging 1 files, 20978112 bytes from disk
18/04/26 22:01:44 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
18/04/26 22:01:44 INFO mapred.Merger: Merging 1 sorted segments
18/04/26 22:01:44 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 20978093 bytes
18/04/26 22:01:44 INFO mapred.LocalJobRunner: 2 / 2 copied.
18/04/26 22:01:44 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
18/04/26 22:01:46 INFO mapred.Task: Task:attempt_local616959303_0001_r_000000_0 is done. And is in the process of committing
18/04/26 22:01:46 INFO mapred.LocalJobRunner: 2 / 2 copied.
18/04/26 22:01:46 INFO mapred.Task: Task attempt_local616959303_0001_r_000000_0 is allowed to commit now
18/04/26 22:01:46 INFO output.FileOutputCommitter: Saved output of task 'attempt_local616959303_0001_r_000000_0' to hdfs://master:9000/user/hadoop/output5/_temporary/0/task_local616959303_0001_r_000000
18/04/26 22:01:46 INFO mapred.LocalJobRunner: reduce > reduce
18/04/26 22:01:46 INFO mapred.Task: Task 'attempt_local616959303_0001_r_000000_0' done.
18/04/26 22:01:46 INFO mapred.LocalJobRunner: Finishing task: attempt_local616959303_0001_r_000000_0
18/04/26 22:01:46 INFO mapred.LocalJobRunner: reduce task executor complete.
18/04/26 22:01:46 INFO mapreduce.Job: map 100% reduce 100%
18/04/26 22:01:46 INFO mapreduce.Job: Job job_local616959303_0001 completed successfully
18/04/26 22:01:47 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=41957888
FILE: Number of bytes written=80619478
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=468040246
HDFS: Number of bytes written=45692
HDFS: Number of read operations=25
HDFS: Number of large read operations=0
HDFS: Number of write operations=5
Map-Reduce Framework
Map input records=640282
Map output records=1277194
Map output bytes=18423718
Map output materialized bytes=20978118
Input split bytes=268
Combine input records=0
Combine output records=0
Reduce input groups=2677
Reduce shuffle bytes=20978118
Reduce input records=1277194
Reduce output records=2677
Spilled Records=2554388
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=388
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=455946240
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=166909211
File Output Format Counters
Bytes Written=45692