1. Time to SELECT the ID column from 600 million rows
hive> INSERT OVERWRITE LOCAL DIRECTORY '/home/hadoop/aaa' select id from a;
Total MapReduce jobs = 1
Launching Job 1 out of 1
2011-07-13 19:57:35,737 Stage-1 map = 0%, reduce = 0%
2011-07-13 19:57:40,786 Stage-1 map = 10%, reduce = 0%
2011-07-13 19:57:41,801 Stage-1 map = 24%, reduce = 0%
2011-07-13 19:57:42,811 Stage-1 map = 25%, reduce = 0%
Ended Job = job_201107111406_0013
Copying data to local directory /home/hadoop/aaa
Copying data to local directory /home/hadoop/aaa
612647594 Rows loaded to /home/hadoop/aaa
OK
Time taken: 182.608 seconds
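Much of the 182 seconds above is spent copying the result down to the client's local filesystem. If the output only needs to live on the cluster, Hive can write straight to HDFS by dropping the `LOCAL` keyword, which skips the local copy step. A sketch (the HDFS path `/user/hadoop/aaa` is illustrative):

```sql
-- Export to HDFS instead of the client's local filesystem;
-- omitting LOCAL makes the path an HDFS directory.
INSERT OVERWRITE DIRECTORY '/user/hadoop/aaa'
SELECT id FROM a;
```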
2. Hive COUNT time
Counting the same 600-million-plus records with Hive takes only about 31 seconds:
> select count(1) from a;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
2011-07-13 20:01:00,272 Stage-1 map = 0%, reduce = 0%
2011-07-13 20:01:03,291 Stage-1 map = 4%, reduce = 0%
2011-07-13 20:01:04,300 Stage-1 map = 24%, reduce = 0%
Ended Job = job_201106111406_0014
OK
612647594
Time taken: 31.689 seconds
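The count job above compiled down to a single reducer. The hints Hive prints in the log can be applied as session-level settings before rerunning the query; the values below are illustrative, not tuned for this cluster:

```sql
-- Session-level reducer tuning (example values, not benchmarked):
-- target bytes of input per reducer
SET hive.exec.reducers.bytes.per.reducer=1000000000;
-- cap on the number of reducers Hive may launch
SET hive.exec.reducers.max=10;
-- or pin an exact reducer count
SET mapred.reduce.tasks=4;

SELECT count(1) FROM a;
```

Note that for a global `count(1)`, the final aggregation still funnels through one reducer, so these settings matter more for grouped aggregations and joins than for this particular query.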