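Submitting a Hive job to YARN is slow when the data set is small, because the overhead of submitting and scheduling the job can far exceed the actual computation. For such cases Hive supports a local mode that runs the job in-process.
set hive.exec.mode.local.auto=true; (the default is false)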
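Automatic local mode only applies to jobs that Hive considers small. The following is a minimal sketch of the related settings, assuming a Hive version in which these properties are available; the values are the documented defaults, not values read from this cluster.

```sql
-- Let Hive decide per query whether to run locally (default: false).
set hive.exec.mode.local.auto=true;
-- Upper bound on the total input size for local mode, in bytes
-- (documented default: 134217728, i.e. 128 MB).
set hive.exec.mode.local.auto.inputbytes.max=134217728;
-- Upper bound on the number of input files for local mode
-- (documented default: 4).
set hive.exec.mode.local.auto.input.files.max=4;
```

A query that exceeds either limit, or that needs more than one reduce task, is still submitted to the cluster even when the switch is on.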
Test example:
Experiment: first create a table and specify sequencefile as its file format.
create table t_seq(id int, name string, addr string)
stored as sequencefile;
Then insert data into the table; Hive generates a sequence file and places it in the table directory.
insert into table t_seq
select * from t_1;
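To check that the table really uses the sequencefile format, the table metadata and the table directory can be inspected. This is a small sketch; the warehouse path is the one that appears in the job log of this note and may differ on other clusters.

```sql
-- Shows storage details, including the InputFormat/OutputFormat classes of the table.
describe formatted t_seq;
-- Lists the files Hive wrote into the table directory
-- (path assumed from the "Loading data to table" log line).
dfs -ls /user/hive/warehouse/mydb.db/t_seq;
```

For a table created with stored as sequencefile, the InputFormat reported by describe formatted should be a SequenceFile input format class.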
Non-local mode (execution is very slow):
0: jdbc:hive2://master.hadoop:10000> create table t_seq(id int, name string, addr string)
0: jdbc:hive2://master.hadoop:10000> stored as sequencefile;
No rows affected (0.936 seconds)
0: jdbc:hive2://master.hadoop:10000> desc t_seq;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | int        |          |
| name      | string     |          |
| addr      | string     |          |
+-----------+------------+----------+--+
3 rows selected (0.271 seconds)
0: jdbc:hive2://master.hadoop:10000> insert into table t_seq
0: jdbc:hive2://master.hadoop:10000> select * from t_1;
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_1535503381660_0001
INFO : The url to track the job: http://master.hadoop:8088/proxy/application_1535503381660_0001/
INFO : Starting Job = job_1535503381660_0001, Tracking URL = http://master.hadoop:8088/proxy/application_1535503381660_0001/
INFO : Kill Command = /apps/hadoop-2.8.0//bin/hadoop job -kill job_1535503381660_0001
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
INFO : 2018-08-29 09:34:51,400 Stage-1 map = 0%, reduce = 0%
INFO : 2018-08-29 09:35:50,891 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.8 sec
INFO : MapReduce Total cumulative CPU time: 2 seconds 800 msec
INFO : Ended Job = job_1535503381660_0001
INFO : Stage-4 is selected by condition resolver.
INFO : Stage-3 is filtered out by condition resolver.
INFO : Stage-5 is filtered out by condition resolver.
INFO : Moving data to: hdfs://ns/user/hive/warehouse/mydb.db/t_seq/.hive-staging_hive_2018-08-29_09-33-06_088_8510141274510902888-1/-ext-10000 from hdfs://ns/user/hive/warehouse/mydb.db/t_seq/.hive-staging_hive_2018-08-29_09-33-06_088_8510141274510902888-1/-ext-10002
INFO : Loading data to table mydb.t_seq from hdfs://ns/user/hive/warehouse/mydb.db/t_seq/.hive-staging_hive_2018-08-29_09-33-06_088_8510141274510902888-1/-ext-10000
INFO : Table mydb.t_seq stats: [numFiles=1, numRows=6, totalSize=249, rawDataSize=84]
No rows affected (173.614 seconds)
0: jdbc:hive2://master.hadoop:10000>
0: jdbc:hive2://master.hadoop:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| t_1       |
| t_seq     |
+-----------+--+
2 rows selected (0.237 seconds)
0: jdbc:hive2://master.hadoop:10000> select * from t_seq;
+-----------+-------------+-------------+--+
| t_seq.id  | t_seq.name  | t_seq.addr  |
+-----------+-------------+-------------+--+
| 1         | user1       | 123123      |
| 2         | user2       | 123123      |
| 3         | user3       | 123123      |
| 1         | user1       | 123123      |
| 2         | user2       | 123123      |
| 3         | user3       | 123123      |
+-----------+-------------+-------------+--+
6 rows selected (0.28 seconds)
0: jdbc:hive2://master.hadoop:10000>
Local mode (note: run set hive.exec.mode.local.auto=true; before executing the insert):
0: jdbc:hive2://master.hadoop:10000> drop table t_seq;
No rows affected (0.474 seconds)
0: jdbc:hive2://master.hadoop:10000> create table t_seq(id int, name string, addr string)
0: jdbc:hive2://master.hadoop:10000> stored as sequencefile;
No rows affected (0.236 seconds)
0: jdbc:hive2://master.hadoop:10000> desc t_seq;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | int        |          |
| name      | string     |          |
| addr      | string     |          |
+-----------+------------+----------+--+
3 rows selected (0.225 seconds)
0: jdbc:hive2://master.hadoop:10000> set hive.exec.mode.local.auto=true;
No rows affected (0.055 seconds)
0: jdbc:hive2://master.hadoop:10000> insert into table t_seq
0: jdbc:hive2://master.hadoop:10000> select * from t_1;
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_local1471930920_0001
INFO : The url to track the job: http://localhost:8080/
INFO : Job running in-process (local Hadoop)
INFO : 2018-08-29 09:42:15,722 Stage-1 map = 100%, reduce = 0%
INFO : Ended Job = job_local1471930920_0001
INFO : Stage-4 is selected by condition resolver.
INFO : Stage-3 is filtered out by condition resolver.
INFO : Stage-5 is filtered out by condition resolver.
INFO : Moving data to: hdfs://ns/user/hive/warehouse/mydb.db/t_seq/.hive-staging_hive_2018-08-29_09-42-13_245_3754575447692258889-1/-ext-10000 from hdfs://ns/user/hive/warehouse/mydb.db/t_seq/.hive-staging_hive_2018-08-29_09-42-13_245_3754575447692258889-1/-ext-10002
INFO : Loading data to table mydb.t_seq from hdfs://ns/user/hive/warehouse/mydb.db/t_seq/.hive-staging_hive_2018-08-29_09-42-13_245_3754575447692258889-1/-ext-10000
INFO : Table mydb.t_seq stats: [numFiles=1, numRows=6, totalSize=249, rawDataSize=84]
No rows affected (3.024 seconds)
0: jdbc:hive2://master.hadoop:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| t_1       |
| t_seq     |
+-----------+--+
2 rows selected (0.068 seconds)
0: jdbc:hive2://master.hadoop:10000> select * from t_seq;
+-----------+-------------+-------------+--+
| t_seq.id  | t_seq.name  | t_seq.addr  |
+-----------+-------------+-------------+--+
| 1         | user1       | 123123      |
| 2         | user2       | 123123      |
| 3         | user3       | 123123      |
| 1         | user1       | 123123      |
| 2         | user2       | 123123      |
| 3         | user3       | 123123      |
+-----------+-------------+-------------+--+
6 rows selected (0.281 seconds)
0: jdbc:hive2://master.hadoop:10000>
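In this run the same insert took 173.614 seconds on YARN and 3.024 seconds in local mode. A value set with set applies to the current session only. The sketch below shows how the switch can be inspected and restored, assuming the same beeline session is still open.

```sql
-- Without a value, set prints the current setting of the property.
set hive.exec.mode.local.auto;
-- Restore the default behaviour for the rest of the session.
set hive.exec.mode.local.auto=false;
```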