Benchmark Testing
Following the hints in the installation tutorial, we run the benchmark programs bundled with Hadoop.
Once the cluster is running, first grant the user ownership of its home directory:
mnclab@lmn:~$ hadoop fs -chown mnclab:mnclab /user/mnclab
Also set a space quota on the user's HDFS directory:
mnclab@lmn:~$ hadoop dfsadmin -setSpaceQuota 1t /user/mnclab
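Note that HDFS space quotas are charged against raw disk usage, so every replica of a block counts toward the quota. A minimal sketch of the usable logical capacity, assuming the default replication factor of 3:

```python
# HDFS space quotas count raw bytes across all replicas, so the
# logical capacity available to the user is quota / replication_factor.
def usable_bytes(quota_bytes, replication=3):
    return quota_bytes // replication

TB = 1024 ** 4
gib = usable_bytes(TB, 3) / 1024 ** 3
print(round(gib, 1))  # a 1 TB quota holds ~341.3 GiB of logical data
```

In other words, the 1 TB quota set above leaves roughly a third of that for actual file contents once triple replication is accounted for.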
Before running the tests, leave safe mode:
mnclab@lmn:~$ hadoop dfsadmin -safemode leave
Invoke the test program (running it without arguments prints its usage):
mnclab@lmn:~$ hadoop jar /usr/local/hadoop/hadoop-test-1.0.4.jar TestDFSIO
TestDFSIO.0.0.4
Usage: TestDFSIO -read | -write | -clean [-nrFiles N] [-fileSize MB] [-resFile resultFileName] [-bufferSize Bytes]
The usage message lists the program's three operations, read, write, and clean; -nrFiles is the number of test files and -fileSize is the size of each file in MB.
Run a write test as follows:
mnclab@lmn:~$ hadoop jar /usr/local/hadoop/hadoop-test-1.0.4.jar TestDFSIO -write -nrFiles 10 -fileSize 10000
Run a read test, making sure the files to be read have been written beforehand:
mnclab@lmn:~$ hadoop jar /usr/local/hadoop/hadoop-test-1.0.4.jar TestDFSIO -read -nrFiles 10 -fileSize 10000
After the tests, use -clean to remove the test data:
mnclab@lmn:~$ hadoop jar /usr/local/hadoop/hadoop-test-1.0.4.jar TestDFSIO -clean
Results of the write and read tests:
----- TestDFSIO ----- : write
Date & time: Mon Jan 12 15:06:47 CST 2015
Number of files: 10
Total MBytes processed: 10000
Throughput mb/sec: 4.068040416795149
Average IO rate mb/sec: 5.369630813598633
IO rate std deviation: 3.780532172485119
Test exec time sec: 394.507
----- TestDFSIO ----- : read
Date & time: Mon Jan 12 15:29:12 CST 2015
Number of files: 10
Total MBytes processed: 10000
Throughput mb/sec: 13.675699922322025
Average IO rate mb/sec: 18.236520767211914
IO rate std deviation: 9.58391832617692
Test exec time sec: 237.37
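A note on reading these two rates: TestDFSIO's Throughput is, to our understanding, an aggregate figure (total MB over the summed per-task IO time), while Average IO rate is the mean of the per-file rates, which is why the two differ. A small sketch with made-up per-file samples (hypothetical numbers, not the measurements above):

```python
# Two ways to aggregate per-file IO measurements:
#   throughput      = sum(sizes) / sum(times)   -- aggregate rate
#   average_io_rate = mean(size_i / time_i)     -- mean of per-file rates
def throughput(sizes_mb, times_sec):
    return sum(sizes_mb) / sum(times_sec)

def average_io_rate(sizes_mb, times_sec):
    rates = [s / t for s, t in zip(sizes_mb, times_sec)]
    return sum(rates) / len(rates)

sizes = [1000.0] * 4                  # hypothetical: four 1000 MB files
times = [100.0, 200.0, 400.0, 500.0]  # hypothetical per-file IO times
print(throughput(sizes, times))       # 4000/1200 ~ 3.33 MB/s
print(average_io_rate(sizes, times))  # mean(10, 5, 2.5, 2) = 4.875 MB/s
```

Slow files dominate the aggregate figure because they contribute more total time, so Average IO rate tends to come out higher than Throughput, matching the pattern seen in both result sets above.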
MapReduce Test
Run the word-count program from the bundled examples:
mnclab@lmn:~$ hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar wordcount in out
(The input data is read from the "in" directory and the output is placed in the "out" directory.)
15/01/13 16:21:07 INFO input.FileInputFormat: Total input paths to process : 2
15/01/13 16:21:07 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/01/13 16:21:07 WARN snappy.LoadSnappy: Snappy native library not loaded
15/01/13 16:21:07 INFO mapred.JobClient: Running job: job_201501131455_0004
15/01/13 16:21:08 INFO mapred.JobClient: map 0% reduce 0%
15/01/13 16:21:21 INFO mapred.JobClient: map 50% reduce 0%
15/01/13 16:21:23 INFO mapred.JobClient: map 100% reduce 0%
15/01/13 16:21:35 INFO mapred.JobClient: map 100% reduce 50%
15/01/13 16:21:39 INFO mapred.JobClient: map 100% reduce 100%
15/01/13 16:21:47 INFO mapred.JobClient: Job complete: job_201501131455_0004
15/01/13 16:21:47 INFO mapred.JobClient: Counters: 29
15/01/13 16:21:47 INFO mapred.JobClient: Job Counters
15/01/13 16:21:47 INFO mapred.JobClient: Launched reduce tasks=2
15/01/13 16:21:47 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=17956
15/01/13 16:21:47 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
15/01/13 16:21:47 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
15/01/13 16:21:47 INFO mapred.JobClient: Launched map tasks=2
15/01/13 16:21:47 INFO mapred.JobClient: Data-local map tasks=2
15/01/13 16:21:47 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=27794
15/01/13 16:21:47 INFO mapred.JobClient: File Output Format Counters
15/01/13 16:21:47 INFO mapred.JobClient: Bytes Written=15
15/01/13 16:21:47 INFO mapred.JobClient: FileSystemCounters
15/01/13 16:21:47 INFO mapred.JobClient: FILE_BYTES_READ=57
15/01/13 16:21:47 INFO mapred.JobClient: HDFS_BYTES_READ=248
15/01/13 16:21:47 INFO mapred.JobClient: FILE_BYTES_WRITTEN=87692
15/01/13 16:21:47 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=15
15/01/13 16:21:47 INFO mapred.JobClient: File Input Format Counters
15/01/13 16:21:47 INFO mapred.JobClient: Bytes Read=18
15/01/13 16:21:47 INFO mapred.JobClient: Map-Reduce Framework
15/01/13 16:21:47 INFO mapred.JobClient: Map output materialized bytes=69
15/01/13 16:21:47 INFO mapred.JobClient: Map input records=2
15/01/13 16:21:47 INFO mapred.JobClient: Reduce shuffle bytes=54
15/01/13 16:21:47 INFO mapred.JobClient: Spilled Records=10
15/01/13 16:21:47 INFO mapred.JobClient: Map output bytes=42
15/01/13 16:21:47 INFO mapred.JobClient: Total committed heap usage (bytes)=381816832
15/01/13 16:21:47 INFO mapred.JobClient: CPU time spent (ms)=13190
15/01/13 16:21:47 INFO mapred.JobClient: Combine input records=6
15/01/13 16:21:47 INFO mapred.JobClient: SPLIT_RAW_BYTES=230
15/01/13 16:21:47 INFO mapred.JobClient: Reduce input records=5
15/01/13 16:21:47 INFO mapred.JobClient: Reduce input groups=3
15/01/13 16:21:47 INFO mapred.JobClient: Combine output records=5
15/01/13 16:21:47 INFO mapred.JobClient: Physical memory (bytes) snapshot=560418816
15/01/13 16:21:47 INFO mapred.JobClient: Reduce output records=3
15/01/13 16:21:47 INFO mapred.JobClient: Virtual memory (bytes) snapshot=3988848640
15/01/13 16:21:47 INFO mapred.JobClient: Map output records=6
View the contents of the output directory:
mnclab@lmn:~$ hadoop dfs -ls ./out
Found 4 items
-rw-r--r-- 3 mnclab mnclab 0 2015-01-13 16:21 /user/mnclab/out/_SUCCESS
drwxr-xr-x - mnclab mnclab 0 2015-01-13 16:21 /user/mnclab/out/_logs
-rw-r--r-- 3 mnclab mnclab 10 2015-01-13 16:21 /user/mnclab/out/part-r-00000
-rw-r--r-- 3 mnclab mnclab 5 2015-01-13 16:21 /user/mnclab/out/part-r-00001
Display the results of the example program:
mnclab@lmn:~/input$ hadoop dfs -cat ./out/*
da 1
de 3
df 2
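The core of the wordcount example can be sketched in a few lines of Python. The two input lines below are hypothetical stand-ins for the actual contents of the input files, chosen only to reproduce the counts shown above (6 map output records, 3 distinct words):

```python
from collections import Counter

# Minimal sketch of the wordcount logic: the "map" phase tokenizes each
# input line, and the "reduce" phase sums the count per distinct word.
lines = ["da de df", "de de df"]  # hypothetical input, not the real files
counts = Counter()
for line in lines:               # map: emit one count per token
    counts.update(line.split())
for word in sorted(counts):      # reduce: print per-word totals
    print(word, counts[word])    # prints: da 1 / de 3 / df 2
```

This also makes the counter values above easy to interpret: Map input records=2 (two lines), Map output records=6 (six tokens), and Reduce output records=3 (three distinct words).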
Large-File Storage Test
Use the dfs -put command to upload the local input directory into the in directory:
mnclab@lmn:~$ hadoop dfs -put ./input in
Once the upload succeeds, the 17.4 GB file sample3 can be listed:
mnclab@lmn:~$ hadoop dfs -ls ./in/input
Found 1 items
-rw-r--r-- 3 mnclab supergroup 17380824894 2015-01-15 10:24 /user/mnclab/in/input/sample3
Use the fsck tool to inspect the blocks that make up the file:
mnclab@lmn:~$ hadoop fsck /user/mnclab/in/input/sample3 -files -blocks -racks
FSCK started by mnclab from /192.168.1.122 for path /user/mnclab/in/input/sample3 at Thu Jan 15 10:36:29 CST 2015
/user/mnclab/in/input/sample3 17380824894 bytes, 33 block(s): OK
0. blk_-7493847470710219250_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010]
1. blk_-9008281405499505641_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010]
2. blk_3757727193501133903_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010]
3. blk_7741530916589711785_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
4. blk_1221984705640859177_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010]
5. blk_-3635602167364815171_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
6. blk_-2916977055490954899_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010]
7. blk_7181921784725396842_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
8. blk_-3256168732758515454_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
9. blk_-753396981526374508_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
10. blk_-3583462304774071834_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
11. blk_-4844730020985125484_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010]
12. blk_-8335058804300479373_2642 len=536870912 repl=3 [/default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010]
13. blk_568982061757350680_2642 len=536870912 repl=3 [/default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010]
14. blk_2125778359175382499_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
15. blk_-5993555184871678394_2642 len=536870912 repl=3 [/default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010]
16. blk_-2850401014666871274_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
17. blk_5747089105956576551_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
18. blk_-5407811149802838638_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
19. blk_3372051075860231490_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010]
20. blk_-2578159294926485558_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010]
21. blk_2985533564278413703_2642 len=536870912 repl=3 [/default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010]
22. blk_-501680720738820702_2642 len=536870912 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010]
23. blk_8866749491469304431_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
24. blk_2333164910342786098_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
25. blk_-1829912833460677536_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
26. blk_-8130390531593335862_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
27. blk_7166567844342564949_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
28. blk_5339539591785265973_2642 len=536870912 repl=3 [/default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010]
29. blk_3398276298577443518_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010]
30. blk_5885445885476201282_2642 len=536870912 repl=3 [/default-rack/192.168.1.121:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010]
31. blk_2951416927177229476_2642 len=536870912 repl=3 [/default-rack/192.168.1.123:50010, /default-rack/192.168.1.124:50010, /default-rack/192.168.1.121:50010]
32. blk_6061241691743661695_2642 len=200955710 repl=3 [/default-rack/192.168.1.124:50010, /default-rack/192.168.1.123:50010, /default-rack/192.168.1.121:50010]
Status: HEALTHY
Total size: 17380824894 B
Total dirs: 0
Total files: 1
Total blocks (validated): 33 (avg. block size 526691663 B)
Minimally replicated blocks: 33 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 3
Number of racks: 1
FSCK ended at Thu Jan 15 10:36:29 CST 2015 in 2 milliseconds
The filesystem under path '/user/mnclab/in/input/sample3' is HEALTHY
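The block count and the short final block reported by fsck follow directly from the file size and the 512 MB block size used by this cluster (536870912 bytes, visible in the len= fields above):

```python
import math

# Reproduce the fsck block layout from the file size and block size.
FILE_SIZE = 17380824894             # bytes, from the fsck report
BLOCK_SIZE = 512 * 1024 * 1024      # 536870912 bytes per full block

num_blocks = math.ceil(FILE_SIZE / BLOCK_SIZE)
last_block = FILE_SIZE - (num_blocks - 1) * BLOCK_SIZE
print(num_blocks)  # 33, matching "33 block(s)" above
print(last_block)  # 200955710, the len= of block 32 above
```

So the file occupies 32 full 512 MB blocks plus one partial block of about 192 MB, each stored three times across the three DataNodes.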
Download the file, copying it to the local input directory:
mnclab@lmn:~/input$ hadoop dfs -get /user/mnclab/in/input/sample3 ./
(This took about 3 minutes.)
Copy, move, and delete work in the same fashion.
The file starts out in the /user/mnclab/in/input directory:
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab/in/input
Found 1 items
-rw-r--r-- 3 mnclab supergroup 17380824894 2015-01-1510:24 /user/mnclab/in/input/sample3
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab/in
Found 3 items
drwxr-xr-x - mnclab supergroup 0 2015-01-16 16:24 /user/mnclab/in/input
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test1.txt
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test2.txt
Copy the file into the /user/mnclab/in directory:
mnclab@lmn:~$ hadoop dfs -cp /user/mnclab/in/input/sample3 /user/mnclab/in
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab/in
Found 4 items
drwxr-xr-x - mnclab supergroup 0 2015-01-16 16:24 /user/mnclab/in/input
-rw-r--r-- 3 mnclab supergroup 17380824894 2015-01-16 16:25 /user/mnclab/in/sample3
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test1.txt
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test2.txt
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab/in/input
Found 1 items
-rw-r--r-- 3 mnclab supergroup 17380824894 2015-01-15 10:24 /user/mnclab/in/input/sample3
(This took about 10 minutes.)
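For rough context, the timings above translate into the effective rates below. One plausible reason the HDFS-internal copy is slower than the download is that -cp must write all 3 replicas while -get reads only one, though this sketch only does the arithmetic, using the approximate 3- and 10-minute durations noted above:

```python
# Effective transfer rates for the 17.4 GB file operations above.
FILE_BYTES = 17380824894
MB = 1024 * 1024

get_rate = FILE_BYTES / MB / (3 * 60)   # -get took ~3 minutes
cp_rate = FILE_BYTES / MB / (10 * 60)   # -cp took ~10 minutes
print(round(get_rate, 1))  # ~92.1 MB/s for the download
print(round(cp_rate, 1))   # ~27.6 MB/s for the in-cluster copy
```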
Delete the input directory and the files inside it:
mnclab@lmn:~$ hadoop dfs -rmr -skipTrash /user/mnclab/in/input
Deleted hdfs://192.168.1.122:9000/user/mnclab/in/input
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab/in
Found 3 items
-rw-r--r-- 3 mnclab supergroup 17380824894 2015-01-16 16:25 /user/mnclab/in/sample3
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test1.txt
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test2.txt
Move the file from the in directory up to the user's home directory:
mnclab@lmn:~$ hadoop dfs -mv /user/mnclab/in/sample3 /user/mnclab
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab/in
Found 2 items
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test1.txt
-rw-r--r-- 3 mnclab supergroup 9 2015-01-13 15:36 /user/mnclab/in/test2.txt
mnclab@lmn:~$ hadoop dfs -ls /user/mnclab
Found 3 items
drwxr-xr-x - mnclab supergroup 0 2015-01-16 16:41 /user/mnclab/in
drwxr-xr-x - mnclab mnclab 0 2015-01-13 16:21 /user/mnclab/out
-rw-r--r-- 3 mnclab supergroup 17380824894 2015-01-16 16:25 /user/mnclab/sample3