HDFS Benchmark Testing

I. Hadoop Benchmarks

Hadoop ships with several benchmarks, packaged in a few jar files. This article runs them on the Cloudera distribution (CDH 5.2.0).
[hsu@server01 ~]$ ls /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop* | egrep "examples|test"
/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-examples-2.5.0-mr1-cdh5.2.0.jar
/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-examples.jar
/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-test-2.5.0-mr1-cdh5.2.0.jar
/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-test.jar


(1) Hadoop Test
When hadoop-test.jar is invoked without arguments, it lists all available test programs:
 [hsu@server01 ~]$ sudo hadoop jar /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-test.jar 
 An example program must be given as the first argument.
 Valid program names are:
 DFSCIOTest: Distributed i/o benchmark of libhdfs.
 DistributedFSCheck: Distributed checkup of the file system consistency.
 MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
 TestDFSIO: Distributed i/o benchmark.
 dfsthroughput: measure hdfs throughput
 filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
 loadgen: Generic map/reduce load generator
 mapredtest: A map/reduce test check.
 minicluster: Single process HDFS and MR cluster.
 mrbench: A map/reduce benchmark that can create many small jobs
 nnbench: A benchmark that stresses the namenode.
 testarrayfile: A test for flat files of binary key/value pairs.
 testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
 testfilesystem: A test for FileSystem read/write.
 testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
 testrpc: A test for rpc.
 testsequencefile: A test for flat files of binary key value pairs.
 testsequencefileinputformat: A test for sequence file input format.
 testsetfile: A test for flat files of binary key/value pairs.
 testtextinputformat: A test for text input format.
 threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill


 These programs test Hadoop from several angles. TestDFSIO, mrbench, and nnbench are three of the most widely used.


(2) TestDFSIO write


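TestDFSIO measures the I/O performance of HDFS. It uses a MapReduce job to perform reads or writes concurrently: each map task reads or writes one file, the map output collects statistics about the file it processed, and the reduce accumulates those statistics and produces a summary. TestDFSIO is used as follows: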
TestDFSIO
Usage: TestDFSIO [genericOptions] -read | -write | -append | -clean [-nrFiles N] [-fileSize Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]
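
The aggregation described above can be sketched in Python. This is a minimal illustration, not TestDFSIO's actual code: the function name `summarize` and the per-task input format are hypothetical, and the formulas (throughput as total data divided by the summed task time, average IO rate as the mean of the per-task rates) are an assumption about how the reported figures are defined.

```python
import math

MB = 1024 * 1024  # bytes per megabyte, matching the 1048576000-byte files in the log


def summarize(tasks):
    """Combine per-map-task measurements into summary statistics.

    tasks: list of (bytes_processed, elapsed_ms), one tuple per map task.
    Returns (throughput_mb_s, avg_io_rate_mb_s, io_rate_std_dev).
    """
    n = len(tasks)
    total_bytes = sum(b for b, _ in tasks)
    total_ms = sum(ms for _, ms in tasks)

    # Per-task I/O rate in MB/s.
    rates = [(b / MB) / (ms / 1000.0) for b, ms in tasks]

    # Assumed definition: all data over the SUM of task times (not wall-clock time).
    throughput = (total_bytes / MB) / (total_ms / 1000.0)

    # Assumed definition: plain mean of the per-task rates.
    avg_rate = sum(rates) / n

    # Population standard deviation of the per-task rates.
    variance = sum(r * r for r in rates) / n - avg_rate ** 2
    std_dev = math.sqrt(abs(variance))
    return throughput, avg_rate, std_dev


if __name__ == "__main__":
    # Two hypothetical tasks: 100 MB in 10 s and 100 MB in 5 s.
    print(summarize([(100 * MB, 10000), (100 * MB, 5000)]))
```

Under these assumptions the two figures differ whenever tasks run at different speeds, because slow tasks weigh more heavily in the throughput than in the average IO rate.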


The following example writes 10 files of 1000 MB each to HDFS:
[hsu@server01 ~]$ sudo hadoop jar /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
15/01/13 15:14:17 INFO fs.TestDFSIO: TestDFSIO.1.7
15/01/13 15:14:17 INFO fs.TestDFSIO: nrFiles = 10
15/01/13 15:14:17 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
15/01/13 15:14:17 INFO fs.TestDFSIO: bufferSize = 1000000
15/01/13 15:14:17 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
15/01/13 15:14:18 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
15/01/13 15:14:19 INFO fs.TestDFSIO: created control files for: 10 files
15/01/13 15:15:23 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
15/01/13 15:15:23 INFO fs.TestDFSIO:            Date & time: Tue Jan 13 15:15:23 CST 2015
15/01/13 15:15:23 INFO fs.TestDFSIO:        Number of files: 10
15/01/13 15:15:23 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
15/01/13 15:15:23 INFO fs.TestDFSIO:      Throughput mb/sec: 29.67623230554649
15/01/13 15:15:23 INFO fs.TestDFSIO: Average IO rate mb/sec: 29.899526596069336
15/01/13 15:15:23 INFO fs.TestDFSIO:  IO rate std deviation: 2.6268824639446526
15/01/13 15:15:23 INFO fs.TestDFSIO:     Test exec time sec: 64.203
15/01/13 15:15:23 INFO fs.TestDFSIO: 
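
To read these numbers, it can help to pull them out of the log and derive two extra figures. The sketch below is only an illustration: `parse_summary` is a hypothetical helper, and the interpretation of "Throughput mb/sec" as total data divided by the summed task time is an assumption, not something the log itself states.

```python
import re

# Summary lines copied from the write run above.
LOG = """\
15/01/13 15:15:23 INFO fs.TestDFSIO:        Number of files: 10
15/01/13 15:15:23 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
15/01/13 15:15:23 INFO fs.TestDFSIO:      Throughput mb/sec: 29.67623230554649
15/01/13 15:15:23 INFO fs.TestDFSIO: Average IO rate mb/sec: 29.899526596069336
15/01/13 15:15:23 INFO fs.TestDFSIO:  IO rate std deviation: 2.6268824639446526
15/01/13 15:15:23 INFO fs.TestDFSIO:     Test exec time sec: 64.203
"""


def parse_summary(log):
    """Return {label: value} for every 'label: number' summary line."""
    fields = {}
    for line in log.splitlines():
        m = re.search(r"fs\.TestDFSIO:\s*(.+?):\s*([-+0-9.eE]+)\s*$", line)
        if m:
            fields[m.group(1).strip()] = float(m.group(2))
    return fields


summary = parse_summary(LOG)
total_mb = summary["Total MBytes processed"]

# Data volume divided by the job's wall-clock time (includes job startup overhead).
wall_clock_rate = total_mb / summary["Test exec time sec"]

# Assumption: throughput = total MB / summed task seconds, so this recovers that sum.
summed_task_seconds = total_mb / summary["Throughput mb/sec"]

if __name__ == "__main__":
    print("wall-clock MB/s:", round(wall_clock_rate, 2))
    print("summed task seconds:", round(summed_task_seconds, 2))
```

The wall-clock figure is much higher than the reported throughput because the 10 map tasks write in parallel, while the reported throughput is a per-task style number.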


(3) TestDFSIO read
The following example reads 10 files of 1000 MB each from HDFS (the files written in the previous step):
[hsu@server01 ~]$ sudo hadoop jar /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
15/01/13 15:42:35 INFO fs.TestDFSIO: TestDFSIO.1.7
15/01/13 15:42:35 INFO fs.TestDFSIO: nrFiles = 10
15/01/13 15:42:35 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
15/01/13 15:42:35 INFO fs.TestDFSIO: bufferSize = 1000000
15/01/13 15:42:35 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
15/01/13 15:42:36 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
15/01/13 15:42:37 INFO fs.TestDFSIO: created control files for: 10 files


(4) Cleaning up the test data
[hsu@server01 ~]$ sudo hadoop jar /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop-0.20-mapreduce/hadoop-test.jar TestDFSIO -clean
15/01/13 15:46:51 INFO fs.TestDFSIO: TestDFSIO.1.7
15/01/13 15:46:51 INFO fs.TestDFSIO: nrFiles = 1
15/01/13 15:46:51 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
15/01/13 15:46:51 INFO fs.TestDFSIO: bufferSize = 1000000