Verifying that Hadoop 2.4.1 is installed correctly by running WordCount on a larger dataset

1. Generate the dataset testWordCountFile.txt

  • The generation code is as follows:
    package problem.forthstudy.test;
    
    import java.io.File;
    import java.io.FileWriter;
    
    public class GenerationFile {
        public static void main(String[] args) throws Exception {
            File f = new File("testWordCountFile.txt");
            FileWriter fw = new FileWriter(f);
            // Write 10^8 lines of "hello how are you \n" (19 bytes each),
            // 1.9 * 10^9 bytes in total.
            for (int i = 1; i <= 100000000; i++) {
                fw.write("hello ");
                fw.write("how ");
                fw.write("are ");
                fw.write("you ");
                fw.write("\n");
                // Report progress every 100,000 lines.
                if (i % 100000 == 0)
                    System.out.println(i);
            }
            fw.close();
        }
    }
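
    To compile and run the generator (a minimal sketch, assuming a JDK on the PATH; -d . creates the package directory structure for the class file):

    javac -d . GenerationFile.java
    java problem.forthstudy.test.GenerationFile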
    

    The generated file is 1,900,000,000 bytes (each of the 10^8 lines of "hello how are you \n" is 19 bytes), about 1.77 GiB. It is copied to the cluster via scp.

## Upload the file to the server
scp testWordCountFile.txt jyb@114.212.87.15:/home/jyb/Desktop/zmx/

## Copy the file to the master node
jyb@jyb:~/Desktop/zmx$ scp testWordCountFile.txt zmx@192.168.122.54:~/software/hadoop-2.4.1
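
A quick way to confirm the transfer was not corrupted is to compare checksums on the two ends (an optional check, assuming md5sum is installed on both hosts; the two digests should match):

## Run on both the source and destination hosts and compare the output
md5sum testWordCountFile.txt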

## List the directory; testWordCountFile.txt should now be present
zmx@master:~/software/hadoop-2.4.1$ ls -l
total 1855540
-rwxrwxrwx 1 67974 users      15458 Jun 21  2014 LICENSE.txt
-rwxrwxrwx 1 67974 users        101 Jun 21  2014 NOTICE.txt
-rwxrwxrwx 1 67974 users       1366 Jun 21  2014 README.txt
drwxrwxrwx 2 67974 users       4096 Jun 21  2014 bin
drwxrwxrwx 3 67974 users       4096 Jun 21  2014 etc
drwxrwxrwx 2 67974 users       4096 Jun 21  2014 include
drwxrwxrwx 3 67974 users       4096 Jun 21  2014 lib
drwxrwxrwx 2 67974 users       4096 Jun 21  2014 libexec
drwxrwxr-x 3 zmx   zmx         4096 Mar 18 12:57 logs
drwxrwxrwx 2 67974 users       4096 Jun 21  2014 sbin
drwxrwxrwx 4 67974 users       4096 Jun 21  2014 share
-rw-r--r-- 1 root  root          65 Mar 18 13:09 test.txt
-rw-r--r-- 1 zmx   zmx   1900000000 Mar 18 17:35 testWordCountFile.txt
drwxrwxr-x 4 zmx   zmx         4096 Mar 18 13:12 tmp

 

2. Put the test file into the test directory

## Create the test directory in HDFS if it does not exist yet
zmx@master:~/software/hadoop-2.4.1$ hadoop fs -mkdir input

## Put the file into the test directory
zmx@master:~/software/hadoop-2.4.1$ hadoop fs -put testWordCountFile.txt input/

## Verify that the test file has been imported
zmx@master:~/software/hadoop-2.4.1$ hadoop fs -ls input/
19/03/18 17:38:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   2 zmx supergroup         65 2019-03-18 13:10 input/test.txt
-rw-r--r--   2 zmx supergroup 1900000000 2019-03-18 17:38 input/testWordCountFile.txt
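
To double-check how much data actually landed in HDFS, hadoop fs -du reports per-file sizes (the -h flag prints human-readable sizes; omit it if your build does not support it):

## Confirm the stored size in HDFS
hadoop fs -du -h input/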

 

3. Run the WordCount program

## Run the WordCount example
zmx@master:~/software/hadoop-2.4.1$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar wordcount input/testWordCountFile.txt outputlarge/
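
Note that MapReduce refuses to start if the output directory already exists, so if an earlier attempt left outputlarge/ behind, remove it before resubmitting:

## Remove a leftover output directory from a previous run, if necessary
hadoop fs -rm -r outputlarge/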

   Expected result: the input has 10^8 lines, and each line contains each of the four words exactly once, so every word should appear 10^8 times:

   hello 100000000

   how   100000000

   are   100000000

   you   100000000

 

4. Execution log

## Job output during execution
19/03/18 17:39:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/03/18 17:39:40 INFO client.RMProxy: Connecting to ResourceManager at /192.168.122.54:8032
19/03/18 17:39:41 INFO input.FileInputFormat: Total input paths to process : 1
19/03/18 17:39:41 INFO mapreduce.JobSubmitter: number of splits:15
19/03/18 17:39:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1552885048834_0002
19/03/18 17:39:41 INFO impl.YarnClientImpl: Submitted application application_1552885048834_0002
19/03/18 17:39:41 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1552885048834_0002/
19/03/18 17:39:41 INFO mapreduce.Job: Running job: job_1552885048834_0002
19/03/18 17:39:49 INFO mapreduce.Job: Job job_1552885048834_0002 running in uber mode : false
19/03/18 17:39:49 INFO mapreduce.Job:  map 0% reduce 0%
19/03/18 17:40:03 INFO mapreduce.Job:  map 2% reduce 0%
19/03/18 17:40:04 INFO mapreduce.Job:  map 4% reduce 0%
19/03/18 17:40:06 INFO mapreduce.Job:  map 11% reduce 0%
19/03/18 17:40:07 INFO mapreduce.Job:  map 12% reduce 0%
19/03/18 17:40:09 INFO mapreduce.Job:  map 16% reduce 0%
19/03/18 17:40:12 INFO mapreduce.Job:  map 19% reduce 0%
19/03/18 17:40:13 INFO mapreduce.Job:  map 21% reduce 0%
19/03/18 17:40:14 INFO mapreduce.Job:  map 22% reduce 0%
19/03/18 17:40:15 INFO mapreduce.Job:  map 24% reduce 0%
19/03/18 17:40:16 INFO mapreduce.Job:  map 26% reduce 0%
19/03/18 17:40:18 INFO mapreduce.Job:  map 28% reduce 0%
19/03/18 17:40:19 INFO mapreduce.Job:  map 30% reduce 0%
19/03/18 17:40:21 INFO mapreduce.Job:  map 31% reduce 0%
19/03/18 17:40:22 INFO mapreduce.Job:  map 33% reduce 0%
19/03/18 17:40:24 INFO mapreduce.Job:  map 35% reduce 2%
19/03/18 17:40:25 INFO mapreduce.Job:  map 37% reduce 2%
19/03/18 17:40:27 INFO mapreduce.Job:  map 39% reduce 2%
19/03/18 17:40:28 INFO mapreduce.Job:  map 41% reduce 2%
19/03/18 17:40:29 INFO mapreduce.Job:  map 42% reduce 2%
19/03/18 17:40:30 INFO mapreduce.Job:  map 44% reduce 2%
19/03/18 17:40:31 INFO mapreduce.Job:  map 47% reduce 2%
19/03/18 17:40:32 INFO mapreduce.Job:  map 50% reduce 2%
19/03/18 17:40:33 INFO mapreduce.Job:  map 51% reduce 4%
19/03/18 17:40:34 INFO mapreduce.Job:  map 52% reduce 4%
19/03/18 17:40:35 INFO mapreduce.Job:  map 59% reduce 4%
19/03/18 17:40:36 INFO mapreduce.Job:  map 59% reduce 13%
19/03/18 17:40:37 INFO mapreduce.Job:  map 61% reduce 13%
19/03/18 17:40:38 INFO mapreduce.Job:  map 63% reduce 13%
19/03/18 17:40:40 INFO mapreduce.Job:  map 68% reduce 13%
19/03/18 17:40:41 INFO mapreduce.Job:  map 70% reduce 13%
19/03/18 17:40:42 INFO mapreduce.Job:  map 70% reduce 20%
19/03/18 17:40:44 INFO mapreduce.Job:  map 71% reduce 20%
19/03/18 17:40:47 INFO mapreduce.Job:  map 73% reduce 20%
19/03/18 17:40:50 INFO mapreduce.Job:  map 74% reduce 20%
19/03/18 17:40:53 INFO mapreduce.Job:  map 75% reduce 20%
19/03/18 17:40:56 INFO mapreduce.Job:  map 77% reduce 20%
19/03/18 17:40:59 INFO mapreduce.Job:  map 78% reduce 20%
19/03/18 17:41:02 INFO mapreduce.Job:  map 79% reduce 20%
19/03/18 17:41:03 INFO mapreduce.Job:  map 80% reduce 20%
19/03/18 17:41:05 INFO mapreduce.Job:  map 81% reduce 20%
19/03/18 17:41:08 INFO mapreduce.Job:  map 84% reduce 20%
19/03/18 17:41:09 INFO mapreduce.Job:  map 84% reduce 22%
19/03/18 17:41:12 INFO mapreduce.Job:  map 85% reduce 22%
19/03/18 17:41:13 INFO mapreduce.Job:  map 86% reduce 22%
19/03/18 17:41:16 INFO mapreduce.Job:  map 87% reduce 22%
19/03/18 17:41:18 INFO mapreduce.Job:  map 90% reduce 22%
19/03/18 17:41:19 INFO mapreduce.Job:  map 93% reduce 24%
19/03/18 17:41:21 INFO mapreduce.Job:  map 98% reduce 24%
19/03/18 17:41:22 INFO mapreduce.Job:  map 100% reduce 31%
19/03/18 17:41:23 INFO mapreduce.Job:  map 100% reduce 100%
19/03/18 17:41:23 INFO mapreduce.Job: Job job_1552885048834_0002 completed successfully
19/03/18 17:41:23 INFO mapreduce.Job: Counters: 51
        File System Counters
                FILE: Number of bytes read=6822
                FILE: Number of bytes written=1493950
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1900059144
                HDFS: Number of bytes written=58
                HDFS: Number of read operations=48
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Killed map tasks=5
                Launched map tasks=20
                Launched reduce tasks=1
                Data-local map tasks=15
                Rack-local map tasks=5
                Total time spent by all maps in occupied slots (ms)=1036981
                Total time spent by all reduces in occupied slots (ms)=70468
                Total time spent by all map tasks (ms)=1036981
                Total time spent by all reduce tasks (ms)=70468
                Total vcore-seconds taken by all map tasks=1036981
                Total vcore-seconds taken by all reduce tasks=70468
                Total megabyte-seconds taken by all map tasks=1061868544
                Total megabyte-seconds taken by all reduce tasks=72159232
        Map-Reduce Framework
                Map input records=100000000
                Map output records=400000000
                Map output bytes=3400000000
                Map output materialized bytes=762
                Input split bytes=1800
                Combine input records=400000504
                Combine output records=568
                Reduce input groups=4
                Reduce shuffle bytes=762
                Reduce input records=64
                Reduce output records=4
                Spilled Records=640
                Shuffled Maps =15
                Failed Shuffles=0
                Merged Map outputs=15
                GC time elapsed (ms)=5581
                CPU time spent (ms)=345470
                Physical memory (bytes) snapshot=4269223936
                Virtual memory (bytes) snapshot=11761364992
                Total committed heap usage (bytes)=3090677760
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1900057344
        File Output Format Counters
                Bytes Written=58
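
A few of these counters confirm the run behaved as expected: Map input records=100000000 matches the 10^8 generated lines, and Map output records=400000000 is four words per line. The 15 splits reported at submission are consistent with a 1.9 GB input divided at the default Hadoop 2.x HDFS block size of 128 MB (ceil(1900000000 / 134217728) = 15; an assumption, since the block size is not shown in this log). The combiner collapses the 400 million map outputs into a few hundred intermediate records, so the single reducer receives only 64 records and emits the final 4.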

 

5. Check that the output files were produced

## List the output directory
zmx@master:~/software/hadoop-2.4.1$ hadoop fs -ls outputlarge/
19/03/18 17:42:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   2 zmx supergroup          0 2019-03-18 17:41 outputlarge/_SUCCESS
-rw-r--r--   2 zmx supergroup         58 2019-03-18 17:41 outputlarge/part-r-00000
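
The empty _SUCCESS marker is written by the output committer when the job completes successfully; the actual word counts are in part-r-00000.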

 

6. View the output file contents

## The contents match the expected result
zmx@master:~/software/hadoop-2.4.1$ hadoop fs -cat outputlarge/part-r-00000
19/03/18 17:44:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
are     100000000
hello   100000000
how     100000000
you     100000000
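
To keep a local copy of the result, hadoop fs -get copies the file out of HDFS (the local file name below is arbitrary):

## Copy the result to the local filesystem
hadoop fs -get outputlarge/part-r-00000 wordcount-result.txt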

 
