There are two ways to enable map join; knowing either one is enough.
Method 1:
`set hive.auto.convert.join=true;`
Check that the setting took effect:
set hive.auto.convert.join;
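Automatic conversion only happens when Hive judges one side of the join small enough to hold in memory. A minimal sketch of the related setting, assuming a stock Hive configuration in which `hive.mapjoin.smalltable.filesize` holds that threshold in bytes (the value below is only an example; check the default for your version):

```sql
-- Show the current small-table threshold (in bytes) used for automatic map join conversion.
set hive.mapjoin.smalltable.filesize;
-- Example value only: treat tables up to roughly 50 MB as "small".
set hive.mapjoin.smalltable.filesize=50000000;
```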
Create the table:
create table test1(cookieid string,cookietime string,pv int);
Test data:
insert into test1 values('cookie1','2017-12-10',1);
insert into test1 values('cookie1','2017-12-11',5);
insert into test1 values('cookie1','2017-12-12',7);
insert into test1 values('cookie1','2017-12-13',3);
insert into test1 values('cookie1','2017-12-14',2);
insert into test1 values('cookie1','2017-12-15',4);
insert into test1 values('cookie1','2017-12-16',4);
insert into test1 values('cookie2','2017-12-16',6);
insert into test1 values('cookie2','2017-12-12',7);
insert into test1 values('cookie3','2017-12-22',5);
insert into test1 values('cookie2','2017-12-24',1);
insert into test1 values('a','2017-12-01',3);
insert into test1 values('b','2017-12-00',3);
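Each `insert ... values` statement usually runs as its own job, so loading rows one at a time is slow. As an alternative to the single-row statements, several rows can go into one statement. This is a sketch that assumes Hive 0.14 or later (where multi-row `values` is supported) and shows only three of the rows:

```sql
-- Assumes Hive 0.14+; use this instead of, not in addition to, the single-row inserts.
insert into test1 values
  ('cookie1','2017-12-10',1),
  ('cookie1','2017-12-11',5),
  ('cookie2','2017-12-16',6);
```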
Running method 1 in the Hive CLI:
hive> set hive.auto.convert.join=true;
hive> set hive.auto.convert.join;
hive.auto.convert.join=true
hive> select t1.pv,t1.cookieid from test1 t1 join test1 t2 on t1.cookieid=t2.cookieid;
Query ID = hadoop_20190603212611_ebe1b1ab-e7dc-400e-a3bd-e00197cee052
Total jobs = 1
Execution log at: /tmp/hadoop/hadoop_20190603212611_ebe1b1ab-e7dc-400e-a3bd-e00197cee052.log
2019-06-03 21:26:22 Starting to launch local task to process map join; maximum memory = 518979584
2019-06-03 21:26:26 Dump the side-table for tag: 0 with group count: 5 into file: file:/tmp/hadoop/c8519b0f-a173-4caa-b9b1-9365ec863423/hive_2019-06-03_21-26-11_597_4646555486833218992-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile10--.hashtable
2019-06-03 21:26:26 Uploaded 1 File to: file:/tmp/hadoop/c8519b0f-a173-4caa-b9b1-9365ec863423/hive_2019-06-03_21-26-11_597_4646555486833218992-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile10--.hashtable (431 bytes)
2019-06-03 21:26:26 End of local task; Time Taken: 3.913 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1559615069424_0004, Tracking URL = http://hadoop01:8088/proxy/application_1559615069424_0004/
Kill Command = /home/hadoop/apps/hadoop-2.6.5/bin/hadoop job -kill job_1559615069424_0004
Hadoop job information for Stage-3: number of mappers: 2; number of reducers: 0
2019-06-03 21:26:49,625 Stage-3 map = 0%, reduce = 0%
2019-06-03 21:27:12,749 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 7.01 sec
MapReduce Total cumulative CPU time: 7 seconds 10 msec
Ended Job = job_1559615069424_0004
MapReduce Jobs Launched:
Stage-Stage-3: Map: 2 Cumulative CPU: 7.01 sec HDFS Read: 12050 HDFS Write: 598 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 10 msec
OK
1 cookie1
5 cookie1
7 cookie1
3 cookie1
2 cookie1
4 cookie1
4 cookie1
1 cookie1
5 cookie1
7 cookie1
3 cookie1
2 cookie1
4 cookie1
4 cookie1
3 a
The line "Starting to launch local task to process map join" in the output above shows that a map join was used.
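The second method is not shown in this note. A commonly used alternative is the `MAPJOIN` query hint, and the sketch below assumes that is the method meant. It also assumes a Hive version (0.11 or later) in which hints are ignored unless `hive.ignore.mapjoin.hint` is set to false. Prefixing the query with `explain` is another way to check the result: a plan containing `Map Join Operator` indicates that a map join was chosen.

```sql
-- Assumption: hints are ignored by default, so enable them first.
set hive.ignore.mapjoin.hint=false;

-- Ask Hive to load t2 into memory as the small (hashed) side of the join.
select /*+ MAPJOIN(t2) */ t1.pv, t1.cookieid
from test1 t1 join test1 t2 on t1.cookieid = t2.cookieid;

-- Inspect the plan without running the job; look for "Map Join Operator".
explain
select /*+ MAPJOIN(t2) */ t1.pv, t1.cookieid
from test1 t1 join test1 t2 on t1.cookieid = t2.cookieid;
```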