[Big Data Lab 2] note2: Hadoop Test Tools

社恐患者

Category: Big Data Processing and Analysis

Article link: https://blog.csdn.net/qq_44714521/article/details/109265442

note2: Hadoop Test Tools

0 List the available test tools
1 DFSCIOTest
2 TestDFSIO
3 mrbench
4 nnbench

For the detailed lab steps, see 👉 Hadoop configuration, testing, and examples

0 List the available test tools

cd $HADOOP_HOME/share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar

[root@localhost mapreduce]# hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar
An example program must be given as the first argument.
Valid program names are:
  DFSCIOTest: Distributed i/o benchmark of libhdfs.
  DistributedFSCheck: Distributed checkup of the file system consistency.
  JHLogAnalyzer: Job History Log analyzer.
  MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
  NNdataGenerator: Generate the data to be used by NNloadGenerator
  NNloadGenerator: Generate load on Namenode using NN loadgenerator run WITHOUT MR
  NNloadGeneratorMR: Generate load on Namenode using NN loadgenerator run as MR job
  NNstructureGenerator: Generate the structure to be used by NNdataGenerator
  SliveTest: HDFS Stress Test and Live Data Verification.
  TestDFSIO: Distributed i/o benchmark.
  fail: a job that always fails
  filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
  largesorter: Large-Sort tester
  loadgen: Generic map/reduce load generator
  mapredtest: A map/reduce test check.
  minicluster: Single process HDFS and MR cluster.
  mrbench: A map/reduce benchmark that can create many small jobs
  nnbench: A benchmark that stresses the namenode.
  sleep: A job that sleeps at each map and reduce task.
  testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
  testfilesystem: A test for FileSystem read/write.
  testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
  testsequencefile: A test for flat files of binary key value pairs.
  testsequencefileinputformat: A test for sequence file input format.
  testtextinputformat: A test for text input format.
  threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill
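
When the listing is long, it can help to reduce it to the bare program names. The following is a minimal sketch, not part of Hadoop: `list_test_programs` is a hypothetical helper, and it assumes every entry has the form "  name: description" with two leading spaces, as in the listing above.

```shell
# Read the usage listing on stdin and print one program name per line.
# Assumption: each entry looks like "  name: description" (two leading spaces).
# Lines that do not match that shape (the banner lines, the shell prompt) are skipped.
list_test_programs() {
  sed -n 's/^  \([A-Za-z][A-Za-z0-9]*\): .*/\1/p'
}
```

A possible invocation is `hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar 2>&1 | list_test_programs`; the `2>&1` is there only in case the listing is written to standard error.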

1 DFSCIOTest

Not used here; I could not make sense of it.

2 TestDFSIO

Measures the read and write I/O throughput of HDFS.

2.1 View the parameters

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar TestDFSIO

2.2 Usage

Usage: TestDFSIO [genericOptions] -read [-random | -backward | -skip [-skipSize Size]] | -write | -append | -truncate | -clean [-compression codecClassName] [-nrFiles N] [-size Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]

2.3 Write

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar TestDFSIO -write -nrFiles 5 -size 10MB

List the files (by default TestDFSIO writes its data under /benchmarks/TestDFSIO)

hadoop fs -ls /

2.4 Read

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar TestDFSIO -read -nrFiles 5 -size 10MB

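2.5 View the results

Each run appends its summary to TestDFSIO_results.log in the local working directory, unless -resFile names another file.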

cat TestDFSIO_results.log
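
Because the log accumulates one summary per run, pulling out only the throughput figures can be easier than reading the whole file. This is a minimal sketch with a hypothetical helper, `summarize_dfsio`. It assumes each summary begins with a "----- TestDFSIO ----- : <operation>" line and contains a "Throughput mb/sec: <value>" line, which is the layout I would expect from this Hadoop version; check it against your own log before relying on it.

```shell
# Print "<operation> <throughput>" for every run recorded in the log.
# Assumption: each run starts with "----- TestDFSIO ----- : <op>" and
# contains a line "Throughput mb/sec: <value>".
summarize_dfsio() {
  awk -F': *' '
    /^----- TestDFSIO -----/ { op = $2 }        # remember the operation (write, read, ...)
    /Throughput mb\/sec/     { print op, $2 }   # emit it with the throughput value
  ' "${1:-TestDFSIO_results.log}"
}
```

Called without an argument, `summarize_dfsio` reads TestDFSIO_results.log from the current directory; a different path can be passed as the first argument.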

2.6 Delete the test files

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar TestDFSIO -clean
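
The write, read, and clean steps above can be chained in a small wrapper. The sketch below is only an illustration: `dfsio`, `dfsio_cycle`, and the `DRY_RUN` and `JAR` variables are names introduced here, not part of Hadoop. The read step reuses the files produced by the write step, so the sketch passes the same -nrFiles and -size to both.

```shell
# Jar name taken from the commands in this note; override JAR for another version.
JAR="${JAR:-hadoop-mapreduce-client-jobclient-2.7.7-tests.jar}"

# Run one TestDFSIO step, or only print the command when DRY_RUN=1.
dfsio() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo hadoop jar "$JAR" TestDFSIO "$@"
  else
    hadoop jar "$JAR" TestDFSIO "$@"
  fi
}

# Write, read back, then clean up; stops at the first step that fails.
# $1 = number of files, $2 = size per file (for example 10MB).
dfsio_cycle() {
  dfsio -write -nrFiles "$1" -size "$2" &&
  dfsio -read -nrFiles "$1" -size "$2" &&
  dfsio -clean
}
```

Setting `DRY_RUN=1` before calling `dfsio_cycle 5 10MB` prints the three commands without touching the cluster, which is a cheap way to check the arguments first.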

3 mrbench

mrbench runs a small job repeatedly.
It checks whether small jobs on the cluster run repeatably and efficiently.

3.1 View the parameters

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar mrbench -help

3.2 Usage

Usage: mrbench [-baseDir <base DFS path for output/input, default is /benchmarks/MRBench>] [-jar <local path to job jar file containing Mapper and Reducer implementations, default is current jar file>] [-numRuns <number of times to run the job, default is 1>] [-maps <number of maps for each run, default is 2>] [-reduces <number of reduces for each run, default is 1>] [-inputLines <number of input lines to generate, default is 1>] [-inputType <type of input to generate, one of ascending (default), descending, random>] [-verbose]

3.3 Example

Run a small job 20 times with 3 mappers and 3 reducers, generating 5 input lines in descending order

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar mrbench -numRuns 20 -maps 3 -reduces 3 -inputLines 5 -inputType descending
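
The number of interest from an mrbench run is the average time per job. As a hedged sketch, the hypothetical helper `mrbench_avg_ms` below picks that value out of the command's output. It assumes the run ends with a summary table whose header line starts with "DataLines" and whose next line carries the average time in the fourth column; verify this against your own output.

```shell
# Print the average job time from mrbench output read on stdin.
# Assumption: the summary is a header line starting with "DataLines",
# followed by one row of values with the average time in column 4.
mrbench_avg_ms() {
  awk '$1 == "DataLines" { getline; print $4 }'
}
```

One way to use it is to pipe the mrbench command above into `mrbench_avg_ms`; everything before the summary table is ignored.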

4 nnbench

Tests the load on the NameNode.
This benchmark can create, read, rename, and delete files on HDFS.

4.1 View the parameters

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar nnbench

4.2 Usage

Usage: nnbench <options>
Options:
	-operation <Available operations are create_write open_read rename delete. This option is mandatory>
	 * NOTE: The open_read, rename and delete operations assume that the files they operate on, are already available. The create_write operation must be run before running the other operations.
	-maps <number of maps. default is 1. This is not mandatory>
	-reduces <number of reduces. default is 1. This is not mandatory>
	-startTime <time to start, given in seconds from the epoch. Make sure this is far enough into the future, so all maps (operations) will start at the same time. default is launch time + 2 mins. This is not mandatory>
	-blockSize <Block size in bytes. default is 1. This is not mandatory>
	-bytesToWrite <Bytes to write. default is 0. This is not mandatory>
	-bytesPerChecksum <Bytes per checksum for the files. default is 1. This is not mandatory>
	-numberOfFiles <number of files to create. default is 1. This is not mandatory>
	-replicationFactorPerFile <Replication factor for the files. default is 1. This is not mandatory>
	-baseDir <base DFS path. default is /becnhmarks/NNBench. This is not mandatory>
	-readFileAfterOpen <true or false. if true, it reads the file and reports the average time to read. This is valid with the open_read operation. default is false. This is not mandatory>
	-help: Display the help statement

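4.3 Example

Create 100 files using 3 mappers and 3 reducers, with a replication factor of 3 per file. According to the usage text above, -readFileAfterOpen applies to the open_read operation, so it presumably has no effect on this create_write run.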

hadoop jar hadoop-mapreduce-client-jobclient-2.7.7-tests.jar nnbench -operation create_write -maps 3 -reduces 3 -numberOfFiles 100 -replicationFactorPerFile 3 -readFileAfterOpen true
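
The usage text notes that create_write must run before the other operations. The sketch below chains create_write with a follow-up open_read, where -readFileAfterOpen is documented to apply. It has not been verified on a cluster; `nnbench`, `nnbench_create_then_read`, `DRY_RUN`, and `JAR` are names introduced here for illustration, and passing the same -maps and -numberOfFiles to both steps is my assumption about how the second step finds the files.

```shell
# Jar name taken from the commands in this note; override JAR for another version.
JAR="${JAR:-hadoop-mapreduce-client-jobclient-2.7.7-tests.jar}"

# Run one nnbench operation, or only print the command when DRY_RUN=1.
nnbench() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo hadoop jar "$JAR" nnbench "$@"
  else
    hadoop jar "$JAR" nnbench "$@"
  fi
}

# Create the files first, then open and read them back.
# $1 = number of maps, $2 = number of files.
nnbench_create_then_read() {
  nnbench -operation create_write -maps "$1" -numberOfFiles "$2" &&
  nnbench -operation open_read -maps "$1" -numberOfFiles "$2" -readFileAfterOpen true
}
```

With `DRY_RUN=1` set, `nnbench_create_then_read 3 100` only prints the two commands, so the arguments can be reviewed before anything is submitted.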
