Hadoop TeraSort test

Hardware configuration:
node configuration: 2*4-core 16GB-ram 4*1T-storage
node number: 11
Software configuration (everything else left at the defaults):
replication: 1
---------------------------------
Parameters adjusted during the tests:
mapred.tasktracker.map.tasks.maximum=4 (8 cores per node in total; 4 map + 3 reduce slots leave one core for the datanode and tasktracker)
mapred.tasktracker.reduce.tasks.maximum=3
---------------------------------
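
As a quick sanity check on the slot settings above, the cluster-wide capacity can be worked out from the per-node numbers. This is only a sketch: the constant and function names are made up for illustration, and the values are the ones listed in these notes (11 nodes, 2*4 cores, 4 map slots, 3 reduce slots).

```python
# Per-node figures taken from the configuration listed above.
NODES = 11
CORES_PER_NODE = 2 * 4          # 2 CPUs * 4 cores
MAP_SLOTS_PER_NODE = 4          # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS_PER_NODE = 3       # mapred.tasktracker.reduce.tasks.maximum


def cluster_slots(nodes=NODES):
    """Return (map slots, reduce slots) available across the whole cluster."""
    return nodes * MAP_SLOTS_PER_NODE, nodes * REDUCE_SLOTS_PER_NODE


def spare_cores_per_node():
    """Cores left on each node once every map and reduce slot is busy."""
    return CORES_PER_NODE - MAP_SLOTS_PER_NODE - REDUCE_SLOTS_PER_NODE


if __name__ == "__main__":
    map_slots, reduce_slots = cluster_slots()
    print("map slots:", map_slots)
    print("reduce slots:", reduce_slots)
    print("spare cores per node:", spare_cores_per_node())
```

With these settings the cluster can run at most 44 map tasks and 33 reduce tasks at the same time.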


Parameters varied to test performance:
HDFS block size: 64MB->128MB
Also varied (mapred.map.tasks):
<property>
<name>mapred.map.tasks</name>
<value>2</value>
<description>The default number of map tasks per job.
Ignored when mapred.job.tracker is "local". 
</description>
</property>


[bin/hadoop fs -rmr terasort/input-GB001]
bin/hadoop jar hadoop-0.20.2-examples.jar teragen 10000000 terasort/input-GB001

Generating 10000000 using 2 maps with step of 5000000
10/07/27 12:27:39 INFO mapred.JobClient: Running job: job_201007271223_0003
10/07/27 12:27:40 INFO mapred.JobClient:  map 0% reduce 0%
10/07/27 12:27:54 INFO mapred.JobClient:  map 53% reduce 0%
10/07/27 12:28:00 INFO mapred.JobClient:  map 100% reduce 0%
10/07/27 12:28:02 INFO mapred.JobClient: Job complete: job_201007271223_0003
10/07/27 12:28:02 INFO mapred.JobClient: Counters: 6
10/07/27 12:28:02 INFO mapred.JobClient:   Job Counters
10/07/27 12:28:02 INFO mapred.JobClient:     Launched map tasks=2
10/07/27 12:28:02 INFO mapred.JobClient:   FileSystemCounters
10/07/27 12:28:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
10/07/27 12:28:02 INFO mapred.JobClient:   Map-Reduce Framework
10/07/27 12:28:02 INFO mapred.JobClient:     Map input records=10000000
10/07/27 12:28:02 INFO mapred.JobClient:     Spilled Records=0
10/07/27 12:28:02 INFO mapred.JobClient:     Map input bytes=10000000
10/07/27 12:28:02 INFO mapred.JobClient:     Map output records=10000000
teragen test (elapsed time is listed under each command):
hadoop jar hadoop/hadoop-*-examples.jar teragen              10 terasort/input-KB001
15s

hadoop jar hadoop/hadoop-*-examples.jar teragen        10000 terasort/input-MB001
13s

hadoop jar hadoop/hadoop-*-examples.jar teragen  10000000 terasort/input-GB001
22s

hadoop jar hadoop/hadoop-*-examples.jar teragen  20000000 terasort/input-GB002
34s

hadoop jar hadoop/hadoop-*-examples.jar teragen  30000000 terasort/input-GB003
46s

hadoop jar hadoop/hadoop-*-examples.jar teragen  40000000 terasort/input-GB004
55s

hadoop jar hadoop/hadoop-*-examples.jar teragen  50000000 terasort/input-GB005
70s

hadoop jar hadoop/hadoop-*-examples.jar teragen 100000000 terasort/input-GB010
122s(mapred.map.tasks=02)
066s(mapred.map.tasks=04)
048s(mapred.map.tasks=06)
045s(mapred.map.tasks=08)
041s(mapred.map.tasks=09)
038s(mapred.map.tasks=10)
034s(mapred.map.tasks=11) <- equals the node number (11)
034s(mapred.map.tasks=12)
034s(mapred.map.tasks=13)
030s(mapred.map.tasks=14)
030s(mapred.map.tasks=15)
030s(mapred.map.tasks=16)
028s(mapred.map.tasks=20)
028s(mapred.map.tasks=22) <- 2 CPUs * 11 nodes = 22 CPUs
028s(mapred.map.tasks=23)
028s(mapred.map.tasks=24)
028s(mapred.map.tasks=25)
028 ±1 s(mapred.map.tasks=26)
028 ±1 s(mapred.map.tasks=27)
028 ±1 s(mapred.map.tasks=28)
028 ±1 s(mapred.map.tasks=28)
028 ±1 s(mapred.map.tasks=28)
028s(mapred.map.tasks=30)
029s(mapred.map.tasks=35)
030 ±1 s(mapred.map.tasks=44) <- all available map slots: 4 maps * 11 nodes
043s(mapred.map.tasks=100)
067s(mapred.map.tasks=200)
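
To make the trend in the timings above easier to read, the sketch below turns a subset of them into write throughput and speedup relative to the 2-map run. The timings are copied from the list above; the helper names are hypothetical, and the data size assumes 100-byte teragen rows (100000000 rows = 10^10 bytes, which matches the HDFS_BYTES_WRITTEN ratio in the earlier log).

```python
# teragen of 100000000 rows, 100 bytes per row (assumption stated in the lead-in).
TOTAL_BYTES = 100_000_000 * 100

# mapred.map.tasks -> elapsed seconds, a subset of the measurements listed above.
timings = {2: 122, 4: 66, 6: 48, 8: 45, 11: 34, 22: 28, 44: 30, 100: 43, 200: 67}


def throughput_mb_s(seconds):
    """Write throughput in MB/s (1 MB = 10^6 bytes) for one run."""
    return TOTAL_BYTES / seconds / 1e6


def speedup(seconds, baseline=timings[2]):
    """Speedup of a run relative to the 2-map baseline."""
    return baseline / seconds


if __name__ == "__main__":
    for maps, secs in sorted(timings.items()):
        print(f"{maps:>3} maps: {secs:>3}s  "
              f"{throughput_mb_s(secs):6.1f} MB/s  x{speedup(secs):.2f}")
```

In this subset the fastest run is the 22-map one; by 100 and 200 map tasks the job is slower again, even though only 44 map slots exist.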

------------------------------------------------------------------------------------
bin/hadoop fs -cat terasort/input-GB001/part-00000
.t^#\|v$2\         0AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDDEEEEEEEEEEFFFFFFFFFFGGGGGGGGGGHHHHHHHH
75@~?'WdUF    1IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLMMMMMMMMMMNNNNNNNNNNOOOOOOOOOOPPPPPPPP
w[o||:N&H,        2QQQQQQQQQQRRRRRRRRRRSSSSSSSSSSTTTTTTTTTTUUUUUUUUUUVVVVVVVVVVWWWWWWWWWWXXXXXXXX

------------------------------------------------------------------------------------
bin/hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB001 terasort/output-GB001
10/07/27 00:11:05 INFO terasort.TeraSort: starting
10/07/27 00:11:05 INFO mapred.FileInputFormat: Total input paths to process : 2
10/07/27 00:11:06 INFO util.NativeCodeLoader: Loaded the native-hadoop library
10/07/27 00:11:06 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
10/07/27 00:11:06 INFO compress.CodecPool: Got brand-new compressor
Making 1 from 100000 records
Step size is 100000.0
10/07/27 00:11:06 INFO mapred.JobClient: Running job: job_201007270004_0003
10/07/27 00:11:07 INFO mapred.JobClient:  map 0% reduce 0%
10/07/27 00:11:21 INFO mapred.JobClient:  map 50% reduce 0%
10/07/27 00:11:24 INFO mapred.JobClient:  map 100% reduce 0%
10/07/27 00:11:33 INFO mapred.JobClient:  map 100% reduce 14%
10/07/27 00:11:36 INFO mapred.JobClient:  map 100% reduce 25%
10/07/27 00:11:39 INFO mapred.JobClient:  map 100% reduce 33%
10/07/27 00:11:54 INFO mapred.JobClient:  map 100% reduce 69%
10/07/27 00:11:57 INFO mapred.JobClient:  map 100% reduce 74%
10/07/27 00:12:00 INFO mapred.JobClient:  map 100% reduce 79%
10/07/27 00:12:03 INFO mapred.JobClient:  map 100% reduce 83%
10/07/27 00:12:06 INFO mapred.JobClient:  map 100% reduce 88%
10/07/27 00:12:09 INFO mapred.JobClient:  map 100% reduce 93%
10/07/27 00:12:15 INFO mapred.JobClient:  map 100% reduce 100%
10/07/27 00:12:17 INFO mapred.JobClient: Job complete: job_201007270004_0003
10/07/27 00:12:17 INFO mapred.JobClient: Counters: 19
10/07/27 00:12:17 INFO mapred.JobClient:   Job Counters
10/07/27 00:12:17 INFO mapred.JobClient:     Launched reduce tasks=1
10/07/27 00:12:17 INFO mapred.JobClient:     Rack-local map tasks=4
10/07/27 00:12:17 INFO mapred.JobClient:     Launched map tasks=16
10/07/27 00:12:17 INFO mapred.JobClient:     Data-local map tasks=12
10/07/27 00:12:17 INFO mapred.JobClient:   FileSystemCounters
10/07/27 00:12:17 INFO mapred.JobClient:     FILE_BYTES_READ=2382257412
10/07/27 00:12:17 INFO mapred.JobClient:     HDFS_BYTES_READ=1000057358
10/07/27 00:12:17 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=3402255956
10/07/27 00:12:17 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
10/07/27 00:12:17 INFO mapred.JobClient:   Map-Reduce Framework
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce input groups=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Combine output records=0
10/07/27 00:12:17 INFO mapred.JobClient:     Map input records=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce shuffle bytes=951549114
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce output records=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Spilled Records=33355441
10/07/27 00:12:17 INFO mapred.JobClient:     Map output bytes=1000000000
10/07/27 00:12:17 INFO mapred.JobClient:     Map input bytes=1000000000
10/07/27 00:12:17 INFO mapred.JobClient:     Combine input records=0
10/07/27 00:12:17 INFO mapred.JobClient:     Map output records=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce input records=10000000
10/07/27 00:12:17 INFO terasort.TeraSort: done
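
The log above reports 2 input paths and 16 launched map tasks for the 1 GB input. A rough way to see where 16 comes from is to count input splits per file. The sketch below is a simplification and not the actual Hadoop code: it assumes a 64 MB split size (the block size before the change to 128 MB), 100-byte rows, input divided evenly between the 2 files written by the 2-map teragen run, and the 10% slack that FileInputFormat allows on the last split. All names are hypothetical.

```python
BLOCK_SIZE = 64 * 1024 * 1024   # assumed split size: one 64 MB HDFS block
ROW_BYTES = 100                 # assumed size of one teragen row
SPLIT_SLOP = 1.1                # last split may be up to 10% larger than a block


def splits_per_file(file_bytes, split_size=BLOCK_SIZE):
    """Count the splits for one file, folding a small tail into the last split."""
    count = 0
    remaining = file_bytes
    while remaining / split_size > SPLIT_SLOP:
        count += 1
        remaining -= split_size
    if remaining > 0:
        count += 1
    return count


def expected_maps(rows, input_files=2):
    """Estimate the map task count for a teragen dataset spread over input_files files."""
    file_bytes = rows * ROW_BYTES // input_files
    return input_files * splits_per_file(file_bytes)


if __name__ == "__main__":
    for rows in (10_000_000, 20_000_000, 30_000_000):
        print(rows, "rows ->", expected_maps(rows), "maps")
```

Under these assumptions each 500,000,000-byte file yields 8 splits, which gives the 16 map tasks seen in the log.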


hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-KB001 terasort/output-KB001
22s (2 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-MB001 terasort/output-MB001
22s (2 maps; since this is batch processing, the 1s of network connection is saved)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB001 terasort/output-GB001
76s=22s+54s (16 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB002 terasort/output-GB002
136s=22s+114s (30 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB003 terasort/output-GB003
187s=22s+165s (46 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB004 terasort/output-GB004
250s=22s+228s (60 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB005 terasort/output-GB005
307s=22s+285s (76 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB010 terasort/output-GB010
793s=22s+771s (150 maps)
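
The terasort timings above are split into a fixed 22s (the time the KB- and MB-sized jobs take) plus a data-dependent part. The sketch below applies the same split to each run and reports the data-dependent seconds per GB. The elapsed times come from the measurements above; the names and the choice to treat 22s as a constant overhead are illustrative assumptions.

```python
FIXED_OVERHEAD_S = 22  # elapsed time of the tiny (KB/MB) jobs, treated as fixed cost

# input size in GB -> total elapsed seconds, from the terasort runs above
runs = {1: 76, 2: 136, 3: 187, 4: 250, 5: 307, 10: 793}


def sort_seconds_per_gb(size_gb, total_seconds):
    """Data-dependent sort time per GB after subtracting the fixed overhead."""
    return (total_seconds - FIXED_OVERHEAD_S) / size_gb


if __name__ == "__main__":
    for size_gb, total in sorted(runs.items()):
        print(f"{size_gb:>2} GB: {sort_seconds_per_gb(size_gb, total):.1f} s/GB")
```

From 1 GB to 5 GB the cost stays between 54 and 57 s/GB, while the 10 GB run comes out at about 77 s/GB, so the scaling is no longer linear at that size.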
