Learning Hadoop 2.7.0: Benchmarks
Hadoop comes with a set of benchmark programs that can be run with minimal setup. They are packaged in the tests JAR file; invoking that JAR with no arguments prints the list of available benchmarks.
(from Hadoop: The Definitive Guide)
Listing the benchmarks
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar
Output
[root@hadoop-master hadoop-2.7.0]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar
An example program must be given as the first argument.
Valid program names are:
DFSCIOTest: Distributed i/o benchmark of libhdfs.
DistributedFSCheck: Distributed checkup of the file system consistency.
JHLogAnalyzer: Job History Log analyzer.
MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
NNdataGenerator: Generate the data to be used by NNloadGenerator
NNloadGenerator: Generate load on Namenode using NN loadgenerator run WITHOUT MR
NNloadGeneratorMR: Generate load on Namenode using NN loadgenerator run as MR job
NNstructureGenerator: Generate the structure to be used by NNdataGenerator
SliveTest: HDFS Stress Test and Live Data Verification.
TestDFSIO: Distributed i/o benchmark.
fail: a job that always fails
filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
largesorter: Large-Sort tester
loadgen: Generic map/reduce load generator
mapredtest: A map/reduce test check.
minicluster: Single process HDFS and MR cluster.
mrbench: A map/reduce benchmark that can create many small jobs
nnbench: A benchmark that stresses the namenode.
sleep: A job that sleeps at each map and reduce task.
testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
testfilesystem: A test for FileSystem read/write.
testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
testsequencefile: A test for flat files of binary key value pairs.
testsequencefileinputformat: A test for sequence file input format.
testtextinputformat: A test for text input format.
threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill
Write throughput test
Write 10 files of 10 MB each:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 10MB
[root@hadoop-master hadoop-2.7.0]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 10MB
16/08/05 10:11:42 INFO fs.TestDFSIO: TestDFSIO.1.8
16/08/05 10:11:42 INFO fs.TestDFSIO: nrFiles = 10
16/08/05 10:11:42 INFO fs.TestDFSIO: nrBytes (MB) = 10.0
16/08/05 10:11:42 INFO fs.TestDFSIO: bufferSize = 1000000
16/08/05 10:11:42 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
16/08/05 10:11:43 INFO fs.TestDFSIO: creating control file: 10485760 bytes, 10 files
16/08/05 10:11:44 INFO fs.TestDFSIO: created control files for: 10 files
16/08/05 10:11:45 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/192.168.20.141:8032
16/08/05 10:11:45 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/192.168.20.141:8032
16/08/05 10:11:46 INFO mapred.FileInputFormat: Total input paths to process : 10
16/08/05 10:11:46 INFO mapreduce.JobSubmitter: number of splits:10
16/08/05 10:11:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470403413050_0001
16/08/05 10:11:46 INFO impl.YarnClientImpl: Submitted application application_1470403413050_0001
16/08/05 10:11:47 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1470403413050_0001/
16/08/05 10:11:47 INFO mapreduce.Job: Running job: job_1470403413050_0001
16/08/05 10:11:57 INFO mapreduce.Job: Job job_1470403413050_0001 running in uber mode : false
16/08/05 10:11:57 INFO mapreduce.Job: map 0% reduce 0%
16/08/05 10:12:27 INFO mapreduce.Job: map 40% reduce 0%
16/08/05 10:12:53 INFO mapreduce.Job: map 47% reduce 13%
16/08/05 10:12:54 INFO mapreduce.Job: map 53% reduce 13%
16/08/05 10:13:03 INFO mapreduce.Job: map 60% reduce 13%
16/08/05 10:13:04 INFO mapreduce.Job: map 67% reduce 13%
16/08/05 10:13:06 INFO mapreduce.Job: map 80% reduce 13%
16/08/05 10:13:08 INFO mapreduce.Job: map 83% reduce 13%
16/08/05 10:13:10 INFO mapreduce.Job: map 90% reduce 13%
16/08/05 10:13:11 INFO mapreduce.Job: map 100% reduce 13%
16/08/05 10:13:12 INFO mapreduce.Job: map 100% reduce 17%
16/08/05 10:13:18 INFO mapreduce.Job: map 100% reduce 20%
16/08/05 10:13:30 INFO mapreduce.Job: map 100% reduce 67%
16/08/05 10:13:36 INFO mapreduce.Job: map 100% reduce 100%
16/08/05 10:13:37 INFO mapreduce.Job: Job job_1470403413050_0001 completed successfully
16/08/05 10:13:37 INFO mapreduce.Job: Counters: 51
File System Counters
FILE: Number of bytes read=840
FILE: Number of bytes written=1269539
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2390
HDFS: Number of bytes written=104857676
HDFS: Number of read operations=43
HDFS: Number of large read operations=0
HDFS: Number of write operations=12
Job Counters
Killed map tasks=2
Launched map tasks=13
Launched reduce tasks=1
Data-local map tasks=11
Rack-local map tasks=2
Total time spent by all maps in occupied slots (ms)=646035
Total time spent by all reduces in occupied slots (ms)=66302
Total time spent by all map tasks (ms)=646035
Total time spent by all reduce tasks (ms)=66302
Total vcore-seconds taken by all map tasks=646035
Total vcore-seconds taken by all reduce tasks=66302
Total megabyte-seconds taken by all map tasks=661539840
Total megabyte-seconds taken by all reduce tasks=67893248
Map-Reduce Framework
Map input records=10
Map output records=50
Map output bytes=734
Map output materialized bytes=894
Input split bytes=1270
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=894
Reduce input records=50
Reduce output records=5
Spilled Records=100
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=8218
CPU time spent (ms)=15840
Physical memory (bytes) snapshot=2184921088
Virtual memory (bytes) snapshot=9295441920
Total committed heap usage (bytes)=1374396416
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1120
File Output Format Counters
Bytes Written=76
16/08/05 10:13:37 WARN hdfs.DFSClient: DFSInputStream has been closed already
16/08/05 10:13:37 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
16/08/05 10:13:37 INFO fs.TestDFSIO: Date & time: Fri Aug 05 10:13:37 EDT 2016
16/08/05 10:13:37 INFO fs.TestDFSIO: Number of files: 10
16/08/05 10:13:37 INFO fs.TestDFSIO: Total MBytes processed: 100.0
16/08/05 10:13:37 INFO fs.TestDFSIO: Throughput mb/sec: 1.4243796826482067
16/08/05 10:13:37 INFO fs.TestDFSIO: Average IO rate mb/sec: 6.660604000091553
16/08/05 10:13:37 INFO fs.TestDFSIO: IO rate std deviation: 8.936692949846902
16/08/05 10:13:37 INFO fs.TestDFSIO: Test exec time sec: 112.884
16/08/05 10:13:37 INFO fs.TestDFSIO:
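Note the gap in the summary between "Throughput" (1.42 MB/s) and "Average IO rate" (6.66 MB/s). As far as I can tell from TestDFSIO's reporting, Throughput divides the total data volume by the sum of all per-task IO times, while Average IO rate is the plain arithmetic mean of the per-file rates, so slow straggler tasks drag Throughput down much harder. A local sketch of the two formulas, using hypothetical per-file numbers (not taken from the run above):

```shell
# Hypothetical measurements: one "size_MB time_sec" pair per file
printf '10 4\n10 6\n' | awk '
  { size += $1; time += $2; sum_rate += $1 / $2; n++ }
  END {
    # Throughput: aggregate bytes over aggregate time
    printf "Throughput mb/sec: %.4f\n", size / time
    # Average IO rate: arithmetic mean of the per-file rates
    printf "Average IO rate mb/sec: %.4f\n", sum_rate / n
  }'
```

With these two sample files the aggregate throughput is 2.0 MB/s while the mean of the per-file rates is about 2.08 MB/s; the more uneven the per-task times, the wider the gap.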
Read throughput test
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 10MB
Output
[root@hadoop-master hadoop-2.7.0]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 10MB
16/08/05 10:25:55 INFO fs.TestDFSIO: TestDFSIO.1.8
16/08/05 10:25:55 INFO fs.TestDFSIO: nrFiles = 10
16/08/05 10:25:55 INFO fs.TestDFSIO: nrBytes (MB) = 10.0
16/08/05 10:25:55 INFO fs.TestDFSIO: bufferSize = 1000000
16/08/05 10:25:55 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
16/08/05 10:25:56 INFO fs.TestDFSIO: creating control file: 10485760 bytes, 10 files
16/08/05 10:25:57 INFO fs.TestDFSIO: created control files for: 10 files
16/08/05 10:25:57 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/192.168.20.141:8032
16/08/05 10:25:57 INFO client.RMProxy: Connecting to ResourceManager at hadoop-master/192.168.20.141:8032
16/08/05 10:25:58 INFO mapred.FileInputFormat: Total input paths to process : 10
16/08/05 10:25:58 INFO mapreduce.JobSubmitter: number of splits:10
16/08/05 10:25:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1470403413050_0002
16/08/05 10:25:59 INFO impl.YarnClientImpl: Submitted application application_1470403413050_0002
16/08/05 10:25:59 INFO mapreduce.Job: The url to track the job: http://hadoop-master:8088/proxy/application_1470403413050_0002/
16/08/05 10:25:59 INFO mapreduce.Job: Running job: job_1470403413050_0002
16/08/05 10:26:07 INFO mapreduce.Job: Job job_1470403413050_0002 running in uber mode : false
16/08/05 10:26:07 INFO mapreduce.Job: map 0% reduce 0%
16/08/05 10:26:36 INFO mapreduce.Job: map 20% reduce 0%
16/08/05 10:26:37 INFO mapreduce.Job: map 30% reduce 0%
16/08/05 10:26:38 INFO mapreduce.Job: map 50% reduce 0%
16/08/05 10:26:39 INFO mapreduce.Job: map 100% reduce 0%
16/08/05 10:26:44 INFO mapreduce.Job: map 100% reduce 100%
16/08/05 10:26:46 INFO mapreduce.Job: Job job_1470403413050_0002 completed successfully
16/08/05 10:26:46 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=832
FILE: Number of bytes written=1269501
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=104859990
HDFS: Number of bytes written=77
HDFS: Number of read operations=53
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=10
Launched reduce tasks=1
Data-local map tasks=10
Total time spent by all maps in occupied slots (ms)=291173
Total time spent by all reduces in occupied slots (ms)=5582
Total time spent by all map tasks (ms)=291173
Total time spent by all reduce tasks (ms)=5582
Total vcore-seconds taken by all map tasks=291173
Total vcore-seconds taken by all reduce tasks=5582
Total megabyte-seconds taken by all map tasks=298161152
Total megabyte-seconds taken by all reduce tasks=5715968
Map-Reduce Framework
Map input records=10
Map output records=50
Map output bytes=726
Map output materialized bytes=886
Input split bytes=1270
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=886
Reduce input records=50
Reduce output records=5
Spilled Records=100
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=3177
CPU time spent (ms)=7570
Physical memory (bytes) snapshot=2147774464
Virtual memory (bytes) snapshot=9255862272
Total committed heap usage (bytes)=1374396416
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1120
File Output Format Counters
Bytes Written=77
16/08/05 10:26:46 WARN hdfs.DFSClient: DFSInputStream has been closed already
16/08/05 10:26:46 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
16/08/05 10:26:46 INFO fs.TestDFSIO: Date & time: Fri Aug 05 10:26:46 EDT 2016
16/08/05 10:26:46 INFO fs.TestDFSIO: Number of files: 10
16/08/05 10:26:46 INFO fs.TestDFSIO: Total MBytes processed: 100.0
16/08/05 10:26:46 INFO fs.TestDFSIO: Throughput mb/sec: 29.197080291970803
16/08/05 10:26:46 INFO fs.TestDFSIO: Average IO rate mb/sec: 47.35454177856445
16/08/05 10:26:46 INFO fs.TestDFSIO: IO rate std deviation: 29.781282953365924
16/08/05 10:26:46 INFO fs.TestDFSIO: Test exec time sec: 48.942
16/08/05 10:26:46 INFO fs.TestDFSIO:
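On this cluster the read run is roughly twenty times faster than the write run. That direction of asymmetry is expected: an HDFS write pushes every block through the replication pipeline to multiple DataNodes, while a read streams from a single (often local) replica. A quick check using the two Throughput values reported above:

```shell
# Divide the read Throughput by the write Throughput
# (values copied from the two TestDFSIO summaries above)
awk 'BEGIN { printf "read/write throughput ratio: %.1f\n", 29.197080291970803 / 1.4243796826482067 }'
```

which prints a ratio of about 20.5. The exact numbers are, of course, specific to this small test cluster.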
Viewing results from the log file
TestDFSIO appends its summary to TestDFSIO_results.log in the local working directory; from the installation directory run:
cat TestDFSIO_results.log
Output
[root@hadoop-master hadoop-2.7.0]# cat TestDFSIO_results.log
----- TestDFSIO ----- : write
Date & time: Fri Aug 05 10:13:37 EDT 2016
Number of files: 10
Total MBytes processed: 100.0
Throughput mb/sec: 1.4243796826482067
Average IO rate mb/sec: 6.660604000091553
IO rate std deviation: 8.936692949846902
Test exec time sec: 112.884
----- TestDFSIO ----- : read
Date & time: Fri Aug 05 10:26:46 EDT 2016
Number of files: 10
Total MBytes processed: 100.0
Throughput mb/sec: 29.197080291970803
Average IO rate mb/sec: 47.35454177856445
IO rate std deviation: 29.781282953365924
Test exec time sec: 48.942
Viewing via the web UI
The runs can also be inspected in the ResourceManager web UI, via the tracking URL printed in the job log (http://hadoop-master:8088/).
Cleaning up the test data
Delete the benchmark data (this removes /benchmarks/TestDFSIO from HDFS):
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.0-tests.jar TestDFSIO -clean
Word count test
Create a file named words.txt:
vim words.txt
Contents:
hello hadoop hbase mytest
hadoop-node1
hadoop-master
hadoop-node2
this is my test
Upload the file to HDFS
bin/hadoop fs -put words.txt /tmp/
Count the words with MapReduce
Count the words in the given file and write the results to the specified output path. Despite the .txt suffix, /tmp/words_result.txt is created as a directory, and it must not already exist:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount /tmp/words.txt /tmp/words_result.txt
Result
hadoop 1
hadoop-master 1
hadoop-node1 1
hadoop-node2 1
hbase 1
hello 1
is 1
my 1
mytest 1
test 1
this 1
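As a sanity check, the same counts can be reproduced locally without Hadoop: tr plays the role of the map (split each line into words), sort is the shuffle (group identical keys), and uniq -c is the reduce (count each group). The input lines below are the contents of words.txt above:

```shell
# map: one word per line; shuffle: sort; reduce: count each group
printf '%s\n' 'hello hadoop hbase mytest' \
              'hadoop-node1' 'hadoop-master' 'hadoop-node2' \
              'this is my test' \
  | tr -s ' ' '\n' | sort | uniq -c | awk '{ print $2 "\t" $1 }'
```

In the C locale this prints the same eleven word/count pairs, in the same order, as the part-r-00000 output; other collation locales may order the hadoop-* entries differently.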
Viewing via shell commands
List the root directory:
bin/hadoop fs -ls /
View the output (wordcount writes its result to the part-r-00000 file inside the output directory):
bin/hadoop fs -cat hdfs:///tmp/words_result.txt/part-r-00000
Output
[root@hadoop-master hadoop-2.7.0]# bin/hadoop fs -cat hdfs:///tmp/words_result.txt/part-r-00000
hadoop 1
hadoop-master 1
hadoop-node1 1
hadoop-node2 1
hbase 1
hello 1
is 1
my 1
mytest 1
test 1
this 1
More common HDFS shell commands for Hadoop 2.7.0:
http://www.cuiweiyou.com/1405.html