
1. Locate the example jar


# Change to the mapreduce directory of the Hadoop installation

# List the directory and find the example jar hadoop-mapreduce-examples-2.10.1.jar
root@hecs-x-large-2-linux-20200618145835:~/dong/program/hadoop-2.10.1/share/hadoop/mapreduce# ll
total 5256
drwxr-xr-x 6 1000 qa    4096 Sep 14 21:39 ./
drwxr-xr-x 9 1000 qa    4096 Sep 14 21:39 ../
-rw-r--r-- 1 1000 qa  586815 Sep 14 21:39 hadoop-mapreduce-client-app-2.10.1.jar
-rw-r--r-- 1 1000 qa  787989 Sep 14 21:39 hadoop-mapreduce-client-common-2.10.1.jar
-rw-r--r-- 1 1000 qa 1613911 Sep 14 21:39 hadoop-mapreduce-client-core-2.10.1.jar
-rw-r--r-- 1 1000 qa  199675 Sep 14 21:39 hadoop-mapreduce-client-hs-2.10.1.jar
-rw-r--r-- 1 1000 qa   32779 Sep 14 21:39 hadoop-mapreduce-client-hs-plugins-2.10.1.jar
-rw-r--r-- 1 1000 qa   72212 Sep 14 21:39 hadoop-mapreduce-client-jobclient-2.10.1.jar
-rw-r--r-- 1 1000 qa 1652223 Sep 14 21:39 hadoop-mapreduce-client-jobclient-2.10.1-tests.jar
-rw-r--r-- 1 1000 qa   84008 Sep 14 21:39 hadoop-mapreduce-client-shuffle-2.10.1.jar
-rw-r--r-- 1 1000 qa  303324 Sep 14 21:39 hadoop-mapreduce-examples-2.10.1.jar
drwxr-xr-x 2 1000 qa    4096 Sep 14 21:39 jdiff/
drwxr-xr-x 2 1000 qa    4096 Sep 14 21:39 lib/
drwxr-xr-x 2 1000 qa    4096 Sep 14 21:39 lib-examples/
drwxr-xr-x 2 1000 qa    4096 Sep 14 21:39 sources/
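
If the Hadoop version differs, the jar name changes with it. A minimal sketch for finding the example jar by pattern, assuming the usual share/hadoop/mapreduce layout shown in the listing above (the function name find_example_jars is made up for this illustration):

```python
from pathlib import Path
from typing import List


def find_example_jars(hadoop_home: str) -> List[Path]:
    """Return the MapReduce example jars under a Hadoop install directory."""
    mapreduce_dir = Path(hadoop_home) / "share" / "hadoop" / "mapreduce"
    # Matches names such as hadoop-mapreduce-examples-2.10.1.jar
    return sorted(mapreduce_dir.glob("hadoop-mapreduce-examples-*.jar"))
```

Calling find_example_jars with the Hadoop install directory returns an empty list when no example jar is present.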

2. Run the example jar

# Run the jar: pi selects the example program, the first 3 is the number of map tasks, the second 3 is the number of samples per map
# hadoop jar hadoop-mapreduce-examples-2.10.1.jar pi 3 3
Number of Maps  = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
21/01/16 11:03:45 INFO client.RMProxy: Connecting to ResourceManager at localhost/
21/01/16 11:03:46 INFO input.FileInputFormat: Total input files to process : 3
21/01/16 11:03:46 INFO mapreduce.JobSubmitter: number of splits:3
21/01/16 11:03:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1610510670587_0001
21/01/16 11:03:47 INFO conf.Configuration: resource-types.xml not found
21/01/16 11:03:47 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
21/01/16 11:03:47 INFO resource.ResourceUtils: Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE
21/01/16 11:03:47 INFO resource.ResourceUtils: Adding resource type - name = vcores, units = , type = COUNTABLE
21/01/16 11:03:47 INFO impl.YarnClientImpl: Submitted application application_1610510670587_0001
21/01/16 11:03:47 INFO mapreduce.Job: The url to track the job: http://localhost.vm:8088/proxy/application_1610510670587_0001/
21/01/16 11:03:47 INFO mapreduce.Job: Running job: job_1610510670587_0001
21/01/16 11:03:55 INFO mapreduce.Job: Job job_1610510670587_0001 running in uber mode : false
21/01/16 11:03:55 INFO mapreduce.Job:  map 0% reduce 0%
21/01/16 11:04:03 INFO mapreduce.Job:  map 100% reduce 0%
21/01/16 11:04:15 INFO mapreduce.Job:  map 100% reduce 100%
21/01/16 11:04:16 INFO mapreduce.Job: Job job_1610510670587_0001 completed successfully
21/01/16 11:04:17 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=72
                FILE: Number of bytes written=835625
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=792
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=15
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters 
                Launched map tasks=3
                Launched reduce tasks=1
                Data-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=17266
                Total time spent by all reduces in occupied slots (ms)=8882
                Total time spent by all map tasks (ms)=17266
                Total time spent by all reduce tasks (ms)=8882
                Total vcore-milliseconds taken by all map tasks=17266
                Total vcore-milliseconds taken by all reduce tasks=8882
                Total megabyte-milliseconds taken by all map tasks=17680384
                Total megabyte-milliseconds taken by all reduce tasks=9095168
        Map-Reduce Framework
                Map input records=3
                Map output records=6
                Map output bytes=54
                Map output materialized bytes=84
                Input split bytes=438
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=84
                Reduce input records=6
                Reduce output records=0
                Spilled Records=12
                Shuffled Maps =3
                Failed Shuffles=0
                Merged Map outputs=3
                GC time elapsed (ms)=486
                CPU time spent (ms)=2020
                Physical memory (bytes) snapshot=1026621440
                Virtual memory (bytes) snapshot=7678922752
                Total committed heap usage (bytes)=701497344
        Shuffle Errors
        File Input Format Counters 
                Bytes Read=354
        File Output Format Counters 
                Bytes Written=97
Job Finished in 31.276 seconds
Estimated value of Pi is 3.5555555555555555555
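
Why 3.5555 and not 3.14? With 3 maps and 3 samples each, only 9 points are tested. The sketch below is a plain Python approximation of what the pi example appears to do, not the Hadoop source. It assumes points come from a Halton sequence (bases 2 and 3) in the unit square, that each map counts points inside and outside the inscribed circle, and that the estimate is 4 * inside / total. The names radical_inverse, run_map and estimate_pi are made up for this illustration.

```python
from typing import Tuple


def radical_inverse(index: int, base: int) -> float:
    """Return element `index` of the van der Corput sequence in `base`."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def run_map(offset: int, samples: int) -> Tuple[int, int]:
    """Count points inside and outside the circle for one map task.

    Assumption: the map with the given offset handles sequence
    elements offset+1 .. offset+samples.
    """
    inside = outside = 0
    for i in range(offset + 1, offset + samples + 1):
        # Shift the point so the circle is centered at the origin.
        x = radical_inverse(i, 2) - 0.5
        y = radical_inverse(i, 3) - 0.5
        if x * x + y * y > 0.25:
            outside += 1
        else:
            inside += 1
    return inside, outside


def estimate_pi(num_maps: int, samples_per_map: int) -> float:
    """Sum the per-map counts (the reduce step) and scale by 4."""
    results = [run_map(m * samples_per_map, samples_per_map)
               for m in range(num_maps)]
    total_inside = sum(inside for inside, _ in results)
    return 4.0 * total_inside / (num_maps * samples_per_map)


print(estimate_pi(3, 3))
```

Under these assumptions 8 of the 9 points fall inside the circle, so the estimate is 4 * 8 / 9, about 3.5556, which agrees with the job output. Each map emitting one inside count and one outside count would also be consistent with the counters Map output records=6 and Reduce input groups=2. More samples give a closer estimate.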

3. Findings

To get a sense of what Hadoop can do, first run a working Hadoop MapReduce example. What can we learn from its output?

1. Finding: the number of map tasks is configurable


Number of Maps  = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
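
One "Wrote input for Map #N" line appears per map task. A small sketch of how the work could be divided, assuming each input file holds an (offset, size) pair that tells its map which samples to generate (plan_map_inputs is a made-up name, and the record layout is an assumption):

```python
from typing import List, Tuple


def plan_map_inputs(num_maps: int, samples_per_map: int) -> List[Tuple[int, int]]:
    """Return one (offset, size) record per map task."""
    return [(m * samples_per_map, samples_per_map) for m in range(num_maps)]


for map_id, (offset, size) in enumerate(plan_map_inputs(3, 3)):
    print(f"Map #{map_id}: offset={offset}, size={size}")
```

Changing the first argument changes how many input files, and therefore how many map tasks, the job gets.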

2. Finding: what happens when the job is submitted

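Step 1: RMProxy connects to the ResourceManager.
Step 2: FileInputFormat finds three input files to process.
Step 3: JobSubmitter computes three splits, one per input file, and then submits the job.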

Starting Job
21/01/16 11:03:45 INFO client.RMProxy: Connecting to ResourceManager at localhost/
21/01/16 11:03:46 INFO input.FileInputFormat: Total input files to process : 3
21/01/16 11:03:46 INFO mapreduce.JobSubmitter: number of splits:3
21/01/16 11:03:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1610510670587_0001
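
Why three splits for three files? A rough sketch of the split arithmetic usually attributed to FileInputFormat, assuming a splittable, non-empty file and a split size equal to the default 128 MB block size. The 1.1 slop factor and the function name count_splits are part of this illustration, not taken from the log above.

```python
SPLIT_SLOP = 1.1  # a tail up to 10% larger than the split size is not split again


def count_splits(file_size: int, split_size: int) -> int:
    """Count the input splits for one splittable, non-empty file (sizes in bytes)."""
    if file_size <= 0 or split_size <= 0:
        raise ValueError("file_size and split_size must be positive")
    splits = 0
    remaining = file_size
    # Cut full-size splits while the remainder is clearly larger than one split.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the last split.
    if remaining > 0:
        splits += 1
    return splits


BLOCK = 128 * 1024 * 1024
print(sum(count_splits(118, BLOCK) for _ in range(3)))
```

The 118-byte file size is inferred from Bytes Read=354 divided by three files. Files this small produce one split each, so the number of splits equals the number of input files, and each split is handled by one map task.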

3. Finding: the job execution flow


21/01/16 11:03:47 INFO impl.YarnClientImpl: Submitted application application_1610510670587_0001
21/01/16 11:03:47 INFO mapreduce.Job: The url to track the job: http://localhost.vm:8088/proxy/application_1610510670587_0001/
21/01/16 11:03:47 INFO mapreduce.Job: Running job: job_1610510670587_0001
21/01/16 11:03:55 INFO mapreduce.Job: Job job_1610510670587_0001 running in uber mode : false
21/01/16 11:03:55 INFO mapreduce.Job:  map 0% reduce 0%
21/01/16 11:04:03 INFO mapreduce.Job:  map 100% reduce 0%
21/01/16 11:04:15 INFO mapreduce.Job:  map 100% reduce 100%
21/01/16 11:04:16 INFO mapreduce.Job: Job job_1610510670587_0001 completed successfully
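
The progress lines can be read mechanically. A minimal sketch that pulls the map and reduce percentages and their timestamps out of the log excerpt above (the regular expression and the name parse_progress are assumptions fitted to this log format):

```python
import re
from datetime import datetime
from typing import List, Tuple

LOG_LINES = """\
21/01/16 11:03:47 INFO mapreduce.Job: Running job: job_1610510670587_0001
21/01/16 11:03:55 INFO mapreduce.Job:  map 0% reduce 0%
21/01/16 11:04:03 INFO mapreduce.Job:  map 100% reduce 0%
21/01/16 11:04:15 INFO mapreduce.Job:  map 100% reduce 100%
21/01/16 11:04:16 INFO mapreduce.Job: Job job_1610510670587_0001 completed successfully
""".splitlines()

PROGRESS = re.compile(r"^(\S+ \S+) INFO mapreduce\.Job:\s+map (\d+)% reduce (\d+)%$")


def parse_progress(lines: List[str]) -> List[Tuple[datetime, int, int]]:
    """Return (timestamp, map percent, reduce percent) for each progress line."""
    events = []
    for line in lines:
        match = PROGRESS.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
            events.append((stamp, int(match.group(2)), int(match.group(3))))
    return events


events = parse_progress(LOG_LINES)
map_seconds = (events[1][0] - events[0][0]).total_seconds()
reduce_seconds = (events[2][0] - events[1][0]).total_seconds()
print(map_seconds, reduce_seconds)
```

For this run, the map phase goes from 0% to 100% in 8 seconds, and reduce reaches 100% 12 seconds later. The reduce percentage only moves after the maps have finished.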

4. Finding: which components take part in the job from start to finish

1. File System

2. Job

3. Map-Reduce Framework

4. Shuffle

5. File Input Format

6. File Output Format

21/01/16 11:04:17 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=72
                FILE: Number of bytes written=835625
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=792
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=15
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters 
                Launched map tasks=3
                Launched reduce tasks=1
                Data-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=17266
                Total time spent by all reduces in occupied slots (ms)=8882
                Total time spent by all map tasks (ms)=17266
                Total time spent by all reduce tasks (ms)=8882
                Total vcore-milliseconds taken by all map tasks=17266
                Total vcore-milliseconds taken by all reduce tasks=8882
                Total megabyte-milliseconds taken by all map tasks=17680384
                Total megabyte-milliseconds taken by all reduce tasks=9095168
        Map-Reduce Framework
                Map input records=3
                Map output records=6
                Map output bytes=54
                Map output materialized bytes=84
                Input split bytes=438
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=84
                Reduce input records=6
                Reduce output records=0
                Spilled Records=12
                Shuffled Maps =3
                Failed Shuffles=0
                Merged Map outputs=3
                GC time elapsed (ms)=486
                CPU time spent (ms)=2020
                Physical memory (bytes) snapshot=1026621440
                Virtual memory (bytes) snapshot=7678922752
                Total committed heap usage (bytes)=701497344
        Shuffle Errors
        File Input Format Counters 
                Bytes Read=354
        File Output Format Counters 
                Bytes Written=97
Job Finished in 31.276 seconds
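
The counter report is regular enough to parse into groups. A minimal sketch, assuming group headers are indented lines without "=" and counters are indented name=value lines (parse_counters is a made-up name, and SAMPLE is a shortened copy of the report above):

```python
from typing import Dict, Optional

SAMPLE = """\
21/01/16 11:04:17 INFO mapreduce.Job: Counters: 49
        Job Counters
                Launched map tasks=3
                Total vcore-milliseconds taken by all map tasks=17266
                Total megabyte-milliseconds taken by all map tasks=17680384
        Map-Reduce Framework
                Map output records=6
                Shuffled Maps =3
Job Finished in 31.276 seconds
"""


def parse_counters(text: str) -> Dict[str, Dict[str, int]]:
    """Group counters by their header line."""
    groups: Dict[str, Dict[str, int]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        # Only indented, non-blank lines belong to the counter report.
        if not line.strip() or not line.startswith((" ", "\t")):
            continue
        name, sep, value = line.strip().rpartition("=")
        if not sep:
            current = line.strip()  # a group header such as "Job Counters"
            groups[current] = {}
        elif current is not None:
            groups[current][name.strip()] = int(value)
    return groups


counters = parse_counters(SAMPLE)
job = counters["Job Counters"]
mb_per_vcore = (job["Total megabyte-milliseconds taken by all map tasks"]
                // job["Total vcore-milliseconds taken by all map tasks"])
print(mb_per_vcore)
```

Dividing megabyte-milliseconds by vcore-milliseconds gives 1024. That suggests each map container ran with 1024 MB per vcore, which is an inference from the ratio and not something the log states directly.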






