hadoop的发行版本中附带了几个基准测试,可以用来验证hadoop以及评估hadoop的性能。以运行排序基准为例,首先我们使用hadoop作业randomwrite生成一些随机数,然后使用排序实例对它进行排序。
1.命令hadoop@master:/usr/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar randomwriter -D test.randomwrite.bytes_per_map=100 -D test.randomwrite.maps_per_host=10 /data/unsorted-data
15/12/08 10:59:16 INFO client.RMProxy: Connecting to ResourceManager at master/10.28.23.201:8032
Running 20 maps.
Job started: Tue Dec 08 10:59:17 CST 2015
15/12/08 10:59:17 INFO client.RMProxy: Connecting to ResourceManager at master/10.28.23.201:8032
15/12/08 10:59:18 INFO mapreduce.JobSubmitter: number of splits:20
15/12/08 10:59:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449455612935_0004
15/12/08 10:59:19 INFO impl.YarnClientImpl: Submitted application application_1449455612935_0004
15/12/08 10:59:19 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1449455612935_0004/
15/12/08 10:59:19 INFO mapreduce.Job: Running job: job_1449455612935_0004
15/12/08 10:59:31 INFO mapreduce.Job: Job job_1449455612935_0004 running in uber mode : false
15/12/08 11:14:06 INFO mapreduce.Job: map 0% reduce 0%
15/12/08 11:16:50 INFO mapreduce.Job: map 5% reduce 0%
15/12/08 11:16:54 INFO mapreduce.Job: map 10% reduce 0%
15/12/08 11:17:34 INFO mapreduce.Job: map 15% reduce 0%
15/12/08 11:17:37 INFO mapreduce.Job: map 20% reduce 0%
15/12/08 11:17:58 INFO mapreduce.Job: map 25% reduce 0%
15/12/08 11:18:00 INFO mapreduce.Job: map 30% reduce 0%
15/12/08 11:19:23 INFO mapreduce.Job: map 35% reduce 0%
15/12/08 11:19:53 INFO mapreduce.Job: map 40% reduce 0%
15/12/08 11:31:28 INFO mapreduce.Job: map 50% reduce 0%
15/12/08 11:31:29 INFO mapreduce.Job: map 55% reduce 0%
15/12/08 11:31:36 INFO mapreduce.Job: map 60% reduce 0%
15/12/08 11:31:41 INFO mapreduce.Job: map 65% reduce 0%
15/12/08 11:31:47 INFO mapreduce.Job: map 70% reduce 0%
15/12/08 11:31:55 INFO mapreduce.Job: map 75% reduce 0%
15/12/08 11:31:59 INFO mapreduce.Job: map 80% reduce 0%
15/12/08 11:32:11 INFO mapreduce.Job: map 85% reduce 0%
15/12/08 11:32:13 INFO mapreduce.Job: map 90% reduce 0%
15/12/08 11:32:21 INFO mapreduce.Job: map 95% reduce 0%
15/12/08 11:32:23 INFO mapreduce.Job: map 100% reduce 0%
15/12/08 11:32:35 INFO mapreduce.Job: Job job_1449455612935_0004 completed successfully
15/12/08 11:32:35 INFO mapreduce.Job: Counters: 34
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=2117490
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2450
HDFS: Number of bytes written=21545740123
HDFS: Number of read operations=80
HDFS: Number of large read operations=0
HDFS: Number of write operations=40
Job Counters
Failed map tasks=6
Killed map tasks=5
Launched map tasks=33
Other local map tasks=33
Total time spent by all maps in occupied slots (ms)=25728885
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=25728885
Total vcore-seconds taken by all map tasks=25728885
Total megabyte-seconds taken by all map tasks=26346378240
Map-Reduce Framework
Map input records=20
Map output records=2045013
Input split bytes=2450
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=303154
CPU time spent (ms)=1186310
Physical memory (bytes) snapshot=1977659392
Virtual memory (bytes) snapshot=11327254528
Total committed heap usage (bytes)=971767808
org.apache.hadoop.examples.RandomWriter$Counters
BYTES_WRITTEN=21474993235
RECORDS_WRITTEN=2045013
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=21545740123
Job ended: Tue Dec 08 11:32:35 CST 2015
The job took 1998 seconds.
在eclipse中看到的文件如下:
2.#运行排序
hadoop@master:/usr/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar randomwriter sort /data/unsorted-data /data/sorted-data
15/12/08 13:28:03 INFO client.RMProxy: Connecting to ResourceManager at master/10.28.23.201:8032
Running 20 maps.
Job started: Tue Dec 08 13:28:04 CST 2015
15/12/08 13:28:04 INFO client.RMProxy: Connecting to ResourceManager at master/10.28.23.201:8032
15/12/08 13:28:06 INFO mapreduce.JobSubmitter: number of splits:20
15/12/08 13:28:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449455612935_0005
15/12/08 13:28:07 INFO impl.YarnClientImpl: Submitted application application_1449455612935_0005
15/12/08 13:28:07 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1449455612935_0005/
15/12/08 13:28:07 INFO mapreduce.Job: Running job: job_1449455612935_0005
15/12/08 13:28:27 INFO mapreduce.Job: Job job_1449455612935_0005 running in uber mode : false
15/12/08 13:28:27 INFO mapreduce.Job: map 0% reduce 0%
由于数据量较大,计算时间比较长,这里就没有列出,从控制台可以看出当前的进度:
接近完成:
完整如下:到刚才才完成:
hadoop@master:/usr/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar randomwriter sort /data/unsorted-data /data/sorted-data
15/12/08 13:28:03 INFO client.RMProxy: Connecting to ResourceManager at master/10.28.23.201:8032
Running 20 maps.
Job started: Tue Dec 08 13:28:04 CST 2015
15/12/08 13:28:04 INFO client.RMProxy: Connecting to ResourceManager at master/10.28.23.201:8032
15/12/08 13:28:06 INFO mapreduce.JobSubmitter: number of splits:20
15/12/08 13:28:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1449455612935_0005
15/12/08 13:28:07 INFO impl.YarnClientImpl: Submitted application application_1449455612935_0005
15/12/08 13:28:07 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1449455612935_0005/
15/12/08 13:28:07 INFO mapreduce.Job: Running job: job_1449455612935_0005
15/12/08 13:28:27 INFO mapreduce.Job: Job job_1449455612935_0005 running in uber mode : false
15/12/08 13:28:27 INFO mapreduce.Job: map 0% reduce 0%
^[^R
15/12/08 13:50:32 INFO mapreduce.Job: map 5% reduce 0%
15/12/08 13:50:55 INFO mapreduce.Job: map 10% reduce 0%
15/12/08 13:51:03 INFO mapreduce.Job: map 15% reduce 0%
15/12/08 13:51:24 INFO mapreduce.Job: map 25% reduce 0%
15/12/08 13:51:28 INFO mapreduce.Job: map 30% reduce 0%
15/12/08 13:51:37 INFO mapreduce.Job: map 40% reduce 0%
15/12/08 13:51:40 INFO mapreduce.Job: map 50% reduce 0%
15/12/08 13:52:10 INFO mapreduce.Job: map 55% reduce 0%
15/12/08 13:52:33 INFO mapreduce.Job: map 60% reduce 0%
15/12/08 13:52:47 INFO mapreduce.Job: map 65% reduce 0%
15/12/08 13:52:54 INFO mapreduce.Job: map 70% reduce 0%
15/12/08 14:03:31 INFO mapreduce.Job: map 75% reduce 0%
15/12/08 14:03:55 INFO mapreduce.Job: map 80% reduce 0%
15/12/08 14:03:58 INFO mapreduce.Job: map 90% reduce 0%
15/12/08 14:04:56 INFO mapreduce.Job: map 95% reduce 0%
15/12/08 14:05:00 INFO mapreduce.Job: map 100% reduce 0%
15/12/08 14:05:16 INFO mapreduce.Job: Job job_1449455612935_0005 completed successfully
15/12/08 14:05:18 INFO mapreduce.Job: Counters: 33
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=2111750
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2410
HDFS: Number of bytes written=21545673941
HDFS: Number of read operations=80
HDFS: Number of large read operations=0
HDFS: Number of write operations=40
Job Counters
Killed map tasks=10
Launched map tasks=30
Other local map tasks=31
Total time spent by all maps in occupied slots (ms)=28356497
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=28356497
Total vcore-seconds taken by all map tasks=28356497
Total megabyte-seconds taken by all map tasks=29037052928
Map-Reduce Framework
Map input records=20
Map output records=2042654
Input split bytes=2410
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=202859
CPU time spent (ms)=1204660
Physical memory (bytes) snapshot=2063581184
Virtual memory (bytes) snapshot=11312558080
Total committed heap usage (bytes)=980156416
org.apache.hadoop.examples.RandomWriter$Counters
BYTES_WRITTEN=21474993477
RECORDS_WRITTEN=2042654
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=21545673941
Job ended: Tue Dec 08 14:05:18 CST 2015
The job took 2233 seconds.