1. Fetch configuration
[cevent@hadoop207 ~]$ cd /opt/module/hive-1.2.1/
[cevent@hadoop207 hive-1.2.1]$ ll
total 524
drwxrwxr-x. 3 cevent cevent   4096 Apr 30 15:59 bin
drwxrwxr-x. 2 cevent cevent   4096 May  9 18:40 conf
-rw-rw-r--. 1 cevent cevent  20403 Jun  5 13:48 derby.log
drwxrwxr-x. 4 cevent cevent   4096 Apr 30 15:59 examples
drwxrwxr-x. 7 cevent cevent   4096 Apr 30 15:59 hcatalog
-rw-rw-r--. 1 cevent cevent     23 May  9 13:37 hive01.sql
drwxrwxr-x. 4 cevent cevent   4096 May  7 13:51 lib
-rw-rw-r--. 1 cevent cevent  24754 Apr 30  2015 LICENSE
drwxrwxr-x. 2 cevent cevent   4096 Jun  7 13:22 logs
drwxrwxr-x. 5 cevent cevent   4096 Apr 30 16:44 metastore_db
-rw-rw-r--. 1 cevent cevent    397 Jun 19  2015 NOTICE
-rw-rw-r--. 1 cevent cevent   4366 Jun 19  2015 README.txt
-rw-rw-r--. 1 cevent cevent 421129 Jun 19  2015 RELEASE_NOTES.txt
-rw-rw-r--. 1 cevent cevent     11 May  9 13:27 result.txt
drwxrwxr-x. 3 cevent cevent   4096 Apr 30 15:59 scripts
-rw-rw-r--. 1 cevent cevent    171 May  9 13:24 server.log
-rw-rw-r--. 1 cevent cevent      5 May  8 14:05 server.pid
-rw-rw-r--. 1 cevent cevent    326 May 22 13:34 student.txt
[cevent@hadoop207 hive-1.2.1]$ bin/beeline
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://hadoop207.cevent.com:10000
Connecting to jdbc:hive2://hadoop207.cevent.com:10000
Enter username for jdbc:hive2://hadoop207.cevent.com:10000: cevent
Enter password for jdbc:hive2://hadoop207.cevent.com:10000: ******
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop207.cevent.com:10000> show databases;
+----------------+--+
| database_name |
+----------------+--+
| cevent01 |
| default |
+----------------+--+
2 rows selected (1.695 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> use cevent01;
No rows affected (0.073 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.fetch.task.conversion=none;  -- set fetch conversion to none: every query now runs through MapReduce
No rows affected (0.03 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> select * from c_emp;
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_1591539871344_0001
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591539871344_0001/
INFO : Starting Job = job_1591539871344_0001, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591539871344_0001/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591539871344_0001
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
INFO : 2020-06-07 22:30:41,881 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-07 22:30:53,412 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.11 sec
INFO : MapReduce Total cumulative CPU time: 1 seconds 110 msec
INFO : Ended Job = job_1591539871344_0001
+--------------+--------------+-------------+----------------+-----------------+---------------+----------------+--------------+--+
| c_emp.empno | c_emp.ename | c_emp.jobs | c_emp.manager | c_emp.hiredata | c_emp.salary | c_emp.comment | c_emp.depno |
+--------------+--------------+-------------+----------------+-----------------+---------------+----------------+--------------+--+
| NULL | cevent | NULL | NULL | NULL | NULL | NULL | 619 |
| 7369 | SMITH | CLERK | 7902 | 1980-12-17 | 800.0 | NULL | 20 |
| 7499 | ALLEN | SALESMAN | 7698 | 1981-2-20 | 1600.0 | 300.0 | 30 |
| 7521 | WARD | SALESMAN | 7698 | 1981-2-22 | 1250.0 | 500.0 | 30 |
| 7566 | JONES | MANAGER | 7839 | 1981-4-2 | 2975.0 | NULL | 20 |
| 7654 | MARTIN | SALESMAN | 7698 | 1981-9-28 | 1250.0 | 1400.0 | 30 |
| 7698 | BLAKE | MANAGER | 7839 | 1981-5-1 | 2850.0 | NULL | 30 |
| 7782 | CLARK | MANAGER | 7839 | 1981-6-9 | 2450.0 | NULL | 10 |
| 7788 | SCOTT | ANALYST | 7566 | 1987-4-19 | 3000.0 | NULL | 20 |
| 7839 | KING | PRESIDENT | NULL | 1981-11-17 | 5000.0 | NULL | 10 |
| 7844 | TURNER | SALESMAN | 7698 | 1981-9-8 | 1500.0 | 0.0 | 30 |
| 7876 | ADAMS | CLERK | 7788 | 1987-5-23 | 1100.0 | NULL | 20 |
| 7900 | JAMES | CLERK | 7698 | 1981-12-3 | 950.0 | NULL | 30 |
| 7902 | FORD | ANALYST | 7566 | 1981-12-3 | 3000.0 | NULL | 20 |
| 7934 | MILLER | CLERK | 7782 | 1982-1-23 | 1300.0 | NULL | 10 |
+--------------+--------------+-------------+----------------+-----------------+---------------+----------------+--------------+--+
15 rows selected (30.546 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.fetch.task.conversion=more;  -- set fetch back to more: simple queries skip MapReduce entirely, so they return much faster
No rows affected (0.006 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> select * from c_emp;
+--------------+--------------+-------------+----------------+-----------------+---------------+----------------+--------------+--+
| c_emp.empno | c_emp.ename | c_emp.jobs | c_emp.manager | c_emp.hiredata | c_emp.salary | c_emp.comment | c_emp.depno |
+--------------+--------------+-------------+----------------+-----------------+---------------+----------------+--------------+--+
| NULL | cevent | NULL | NULL | NULL | NULL | NULL | 619 |
| 7369 | SMITH | CLERK | 7902 | 1980-12-17 | 800.0 | NULL | 20 |
| 7499 | ALLEN | SALESMAN | 7698 | 1981-2-20 | 1600.0 | 300.0 | 30 |
| 7521 | WARD | SALESMAN | 7698 | 1981-2-22 | 1250.0 | 500.0 | 30 |
| 7566 | JONES | MANAGER | 7839 | 1981-4-2 | 2975.0 | NULL | 20 |
| 7654 | MARTIN | SALESMAN | 7698 | 1981-9-28 | 1250.0 | 1400.0 | 30 |
| 7698 | BLAKE | MANAGER | 7839 | 1981-5-1 | 2850.0 | NULL | 30 |
| 7782 | CLARK | MANAGER | 7839 | 1981-6-9 | 2450.0 | NULL | 10 |
| 7788 | SCOTT | ANALYST | 7566 | 1987-4-19 | 3000.0 | NULL | 20 |
| 7839 | KING | PRESIDENT | NULL | 1981-11-17 | 5000.0 | NULL | 10 |
| 7844 | TURNER | SALESMAN | 7698 | 1981-9-8 | 1500.0 | 0.0 | 30 |
| 7876 | ADAMS | CLERK | 7788 | 1987-5-23 | 1100.0 | NULL | 20 |
| 7900 | JAMES | CLERK | 7698 | 1981-12-3 | 950.0 | NULL | 30 |
| 7902 | FORD | ANALYST | 7566 | 1981-12-3 | 3000.0 | NULL | 20 |
| 7934 | MILLER | CLERK | 7782 | 1982-1-23 | 1300.0 | NULL | 10 |
+--------------+--------------+-------------+----------------+-----------------+---------------+----------------+--------------+--+
15 rows selected (0.513 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.exec.mode.local.auto=true;  -- enable local mode: MapReduce still runs, but in-process, so it is faster than fetch=none on small inputs; use with care
No rows affected (0.005 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> select count(*) from c_emp;
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:1
INFO : Submitting tokens for job: job_local1010738823_0001
INFO : The url to track the job: http://localhost:8080/
INFO : Job running in-process (local Hadoop)
INFO : 2020-06-07 22:33:37,811 Stage-1 map = 100%, reduce = 100%
INFO : Ended Job = job_local1010738823_0001
+------+--+
| _c0  |
+------+--+
| 15   |
+------+--+
1 row selected (1.613 seconds)
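To recap, the three switches exercised above are all session-scoped; a minimal sketch you can paste into Beeline (values are the ones used in this walkthrough, and the notes on defaults reflect Hive 1.x behavior):

-- force every query through MapReduce (slowest; useful for comparing plans)
set hive.fetch.task.conversion=none;
-- 'more' is the stock default: simple SELECT/filter/LIMIT queries are fetched directly, no job launched
set hive.fetch.task.conversion=more;
-- let Hive run small jobs in-process instead of submitting to YARN
set hive.exec.mode.local.auto=true;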
2. Metadata preparation
3. Changing the default number of reduce tasks Hive launches
[cevent@hadoop207 module]$ cd hive-1.2.1/
[cevent@hadoop207 hive-1.2.1]$ cd conf/
[cevent@hadoop207 conf]$ ll
total 192
-rw-rw-r--. 1 cevent cevent   1139 Apr 30  2015 beeline-log4j.properties.template
-rw-rw-r--. 1 cevent cevent 168431 Jun 19  2015 hive-default.xml.template
-rw-rw-r--. 1 cevent cevent   2464 Apr 30 16:06 hive-env.sh
-rw-rw-r--. 1 cevent cevent   2662 Apr 30  2015 hive-exec-log4j.properties.template
-rw-rw-r--. 1 cevent cevent   3093 May  9 18:12 hive-log4j.properties
-rw-rw-r--. 1 cevent cevent   1354 May  9 18:40 hive-site.xml
-rw-rw-r--. 1 cevent cevent   1593 Apr 30  2015 ivysettings.xml
[cevent@hadoop207 conf]$ cat hive-site.xml   # inspect the Hive config
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- MySQL database URL -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop207.cevent.com:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <!-- MySQL username and password -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>cevent</value>
    <description>password to use against metastore database</description>
  </property>
  <!-- Customize what the Hive CLI prints -->
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <!-- Override of the default reduce-task count; to be removed -->
  <property>
    <name>mapreduce.job.reduces</name>
    <value>4</value>
  </property>
</configuration>
[cevent@hadoop207 conf]$ vim hive-site.xml   # edit the config: delete the mapreduce.job.reduces override
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- MySQL database URL -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop207.cevent.com:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <!-- MySQL username and password -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>cevent</value>
    <description>password to use against metastore database</description>
  </property>
  <!-- Customize what the Hive CLI prints -->
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <!-- the reduce-count override has been removed -->
</configuration>
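Rather than baking the reducer count into hive-site.xml, the same override can be applied per session, which is what a later step in this walkthrough does; a minimal sketch:

-- session-scoped: affects only the current Beeline connection
set mapreduce.job.reduces=4;
-- revert to letting Hive estimate the count from input size (-1 means "determined at compile time")
set mapreduce.job.reduces=-1;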
4. Loading the data
0: jdbc:hive2://hadoop207.cevent.com:10000> create table big_join(
0: jdbc:hive2://hadoop207.cevent.com:10000> id bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> time bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> uid string,
0: jdbc:hive2://hadoop207.cevent.com:10000> keyword string,
0: jdbc:hive2://hadoop207.cevent.com:10000> url_rank int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_num int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_url string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> row format delimited fields terminated by '\t';
No rows affected (0.467 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> create table small_join(
0: jdbc:hive2://hadoop207.cevent.com:10000> id bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> time bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> uid string,
0: jdbc:hive2://hadoop207.cevent.com:10000> keyword string,
0: jdbc:hive2://hadoop207.cevent.com:10000> url_rank int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_num int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_url string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> row format delimited fields terminated by '\t';
No rows affected (0.1 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> create table join_table(
0: jdbc:hive2://hadoop207.cevent.com:10000> id bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> time bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> uid string,
0: jdbc:hive2://hadoop207.cevent.com:10000> keyword string,
0: jdbc:hive2://hadoop207.cevent.com:10000> url_rank int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_num int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_url string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> row format delimited fields terminated by '\t';
No rows affected (0.1 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> load data local inpath '/opt/module/datas/bigtable' into table big_join;
INFO : Loading data to table cevent01.big_join from file:/opt/module/datas/bigtable
INFO : Table cevent01.big_join stats: [numFiles=1, totalSize=120734753]
No rows affected (14.891 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> load data local inpath '/opt/module/datas/smalltable' into table small_join;
INFO : Loading data to table cevent01.small_join from file:/opt/module/datas/smalltable
INFO : Table cevent01.small_join stats: [numFiles=1, totalSize=12018355]
No rows affected (1.723 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> select * from big_join limit 5;
+--------------+-----------------+-----------------------------------+-------------------+--------------------+---------------------+----------------------------------------------------------------+--+
| big_join.id  | big_join.time   | big_join.uid                      | big_join.keyword  | big_join.url_rank  | big_join.click_num  | big_join.click_url                                             |
+--------------+-----------------+-----------------------------------+-------------------+--------------------+---------------------+----------------------------------------------------------------+--+
| 0            | 20111230000005  | 57375476989eea12893c0c3811607bcf  | 奇艺高清          | 1                  | 1                   | http://www.qiyi.com/                                           |
| 0            | 20111230000005  | 66c5bb7774e31d0a22278249b26bc83a  | 凡人修仙传        | 3                  | 1                   | http://www.booksky.org/BookDetail.aspx?BookID=1050804&Level=1  |
| 0            | 20111230000007  | b97920521c78de70ac38e3713f524b50  | 本本联盟          | 1                  | 1                   | http://www.bblianmeng.com/                                     |
| 0            | 20111230000008  | 6961d0c97fe93701fc9c0d861d096cd9  | 华南师范大学图书馆 | 1                 | 1                   | http://lib.scnu.edu.cn/                                        |
| 0            | 20111230000008  | f2f5a21c764aebde1e8afcc2871e086f  | 在线代理          | 2                  | 1                   | http://proxyie.cn/                                             |
+--------------+-----------------+-----------------------------------+-------------------+--------------------+---------------------+----------------------------------------------------------------+--+
5 rows selected (0.613 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> select * from small_join limit 5;
+----------------+------------------+-----------------------------------+---------------------+----------------------+-----------------------+----------------------------------------------------------------+--+
| small_join.id  | small_join.time  | small_join.uid                    | small_join.keyword  | small_join.url_rank  | small_join.click_num  | small_join.click_url                                           |
+----------------+------------------+-----------------------------------+---------------------+----------------------+-----------------------+----------------------------------------------------------------+--+
| 95048          | 20111230000005   | 57375476989eea12893c0c3811607bcf  | 奇艺高清            | 1                    | 1                     | http://www.qiyi.com/                                           |
| 95096          | 20111230000005   | 66c5bb7774e31d0a22278249b26bc83a  | 凡人修仙传          | 3                    | 1                     | http://www.booksky.org/BookDetail.aspx?BookID=1050804&Level=1  |
| 95144          | 20111230000007   | b97920521c78de70ac38e3713f524b50  | 本本联盟            | 1                    | 1                     | http://www.bblianmeng.com/                                     |
| 95192          | 20111230000008   | 6961d0c97fe93701fc9c0d861d096cd9  | 华南师范大学图书馆  | 1                    | 1                     | http://lib.scnu.edu.cn/                                        |
| 95192          | 20111230000008   | f2f5a21c764aebde1e8afcc2871e086f  | 在线代理            | 2                    | 1                     | http://proxyie.cn/                                             |
+----------------+------------------+-----------------------------------+---------------------+----------------------+-----------------------+----------------------------------------------------------------+--+
5 rows selected (0.118 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.auto.convert.join=false;  -- map join disabled for this benchmark; leaving it at the default true lets Hive broadcast the small table, which is faster
No rows affected (0.017 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> insert overwrite table join_table
0: jdbc:hive2://hadoop207.cevent.com:10000> select b.id,b.time,b.uid,b.keyword,b.url_rank,b.click_num,b.click_url
0: jdbc:hive2://hadoop207.cevent.com:10000> from small_join s
0: jdbc:hive2://hadoop207.cevent.com:10000> join big_join b
0: jdbc:hive2://hadoop207.cevent.com:10000> on b.id=s.id;
INFO : Number of reduce tasks not specified. Estimated from input data size: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591593753790_0001
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591593753790_0001/
INFO : Starting Job = job_1591593753790_0001, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591593753790_0001/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591593753790_0001
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
INFO : 2020-06-08 13:41:29,289 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-08 13:42:12,682 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 13.84 sec
INFO : 2020-06-08 13:42:33,048 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 17.94 sec
INFO : 2020-06-08 13:42:34,442 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 23.18 sec
INFO : MapReduce Total cumulative CPU time: 23 seconds 180 msec
INFO : Ended Job = job_1591593753790_0001
INFO : Loading data to table cevent01.join_table from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_table/.hive-staging_hive_2020-06-08_13-41-10_494_1851985601125007593-1/-ext-10000
INFO : Table cevent01.join_table stats: [numFiles=1, numRows=999990, totalSize=123817216, rawDataSize=122817226]
No rows affected (95.171 seconds)
5. SQL walkthrough
-- Hive table optimization
-- 1. join
-- create the big table
create table big_join(
id bigint,
time bigint,
uid string,
keyword string,
url_rank int,
click_num int,
click_url string
)
row format delimited fields terminated by '\t';
-- create the small table
create table small_join(
id bigint,
time bigint,
uid string,
keyword string,
url_rank int,
click_num int,
click_url string
)
row format delimited fields terminated by '\t';
-- create the table that receives the join result
create table join_table(
id bigint,
time bigint,
uid string,
keyword string,
url_rank int,
click_num int,
click_url string
)
row format delimited fields terminated by '\t';
-- load the data
load data local inpath '/opt/module/datas/bigtable' into table big_join;
load data local inpath '/opt/module/datas/smalltable' into table small_join;
-- disable map join: set hive.auto.convert.join=false;
-- disable local mode so it does not skew the timing: set hive.exec.mode.local.auto=false;
-- insert via join, small table first
insert overwrite table join_table
select b.id,b.time,b.uid,b.keyword,b.url_rank,b.click_num,b.click_url
from small_join s
join big_join b
on b.id=s.id;
-- insert via join, big table first
insert overwrite table join_table
select b.id,b.time,b.uid,b.keyword,b.url_rank,b.click_num,b.click_url
from big_join b
join small_join s
on s.id=b.id;
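For contrast with the benchmark above, map join can simply be left on; a sketch, assuming the stock thresholds of Hive 1.2:

-- on by default since Hive 0.11: tables under the threshold are broadcast to the mappers
set hive.auto.convert.join=true;
-- broadcast threshold in bytes (25 MB is the stock default)
set hive.mapjoin.smalltable.filesize=25000000;
insert overwrite table join_table
select b.id, b.time, b.uid, b.keyword, b.url_rank, b.click_num, b.click_url
from small_join s
join big_join b on b.id = s.id;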
6. Preparing the original dataset (ori)
7. Running the inserts
[cevent@hadoop207 hive-1.2.1]$ bin/beeline
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://hadoop207.cevent.com:10000
Connecting to jdbc:hive2://hadoop207.cevent.com:10000
Enter username for jdbc:hive2://hadoop207.cevent.com:10000: cevent
Enter password for jdbc:hive2://hadoop207.cevent.com:10000: ******
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://hadoop207.cevent.com:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| cevent01       |
| default        |
+----------------+--+
2 rows selected (1.77 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> use cevent01;
No rows affected (0.053 seconds)  -- next: populate join_table
0: jdbc:hive2://hadoop207.cevent.com:10000> insert overwrite table join_table
0: jdbc:hive2://hadoop207.cevent.com:10000> select b.id,b.time,b.uid,b.keyword,b.url_rank,b.click_num,b.click_url
0: jdbc:hive2://hadoop207.cevent.com:10000> from small_join s
0: jdbc:hive2://hadoop207.cevent.com:10000> left join big_join b
0: jdbc:hive2://hadoop207.cevent.com:10000> on b.id=s.id;
INFO : Stage-1 is selected by condition resolver.
INFO : Number of reduce tasks not specified. Estimated from input data size: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591675698957_0001
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0001/
INFO : Starting Job = job_1591675698957_0001, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0001/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0001
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
INFO : 2020-06-09 12:13:16,419 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 12:13:43,229 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 5.68 sec
INFO : 2020-06-09 12:13:57,087 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 16.96 sec
INFO : 2020-06-09 12:14:12,299 Stage-1 map = 100%, reduce = 80%, Cumulative CPU 22.19 sec
INFO : 2020-06-09 12:14:15,774 Stage-1 map = 100%, reduce = 98%, Cumulative CPU 25.13 sec
INFO : 2020-06-09 12:14:24,366 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 32.81 sec
INFO : MapReduce Total cumulative CPU time: 32 seconds 810 msec
INFO : Ended Job = job_1591675698957_0001
INFO : Loading data to table cevent01.join_table from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_table/.hive-staging_hive_2020-06-09_12-12-53_365_6693250539177613475-1/-ext-10000
INFO : Table cevent01.join_table stats: [numFiles=1, numRows=999990, totalSize=123817216, rawDataSize=122817226]
No rows affected (96.166 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> create table join_original(  -- table for the original data
0: jdbc:hive2://hadoop207.cevent.com:10000> id bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> time bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> uid string,
0: jdbc:hive2://hadoop207.cevent.com:10000> keyword string,
0: jdbc:hive2://hadoop207.cevent.com:10000> url_rank int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_num int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_url string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> row format delimited fields terminated by '\t';
No rows affected (0.608 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> create table join_original_null_id(  -- table whose join keys include nulls
0: jdbc:hive2://hadoop207.cevent.com:10000> id bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> time bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> uid string,
0: jdbc:hive2://hadoop207.cevent.com:10000> keyword string,
0: jdbc:hive2://hadoop207.cevent.com:10000> url_rank int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_num int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_url string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> row format delimited fields terminated by '\t';
No rows affected (0.135 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> create table join_original_table(  -- table receiving the join result
0: jdbc:hive2://hadoop207.cevent.com:10000> id bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> time bigint,
0: jdbc:hive2://hadoop207.cevent.com:10000> uid string,
0: jdbc:hive2://hadoop207.cevent.com:10000> keyword string,
0: jdbc:hive2://hadoop207.cevent.com:10000> url_rank int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_num int,
0: jdbc:hive2://hadoop207.cevent.com:10000> click_url string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> row format delimited fields terminated by '\t';
No rows affected (0.122 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> load data local inpath '/opt/module/datas/ori' into table join_original;
INFO : Loading data to table cevent01.join_original from file:/opt/module/datas/ori
INFO : Table cevent01.join_original stats: [numFiles=1, totalSize=121734744]
No rows affected (6.715 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> load data local inpath '/opt/module/datas/nullid' into table join_original_null_id;
INFO : Loading data to table cevent01.join_original_null_id from file:/opt/module/datas/nullid
INFO : Table cevent01.join_original_null_id stats: [numFiles=1, totalSize=118645854]
No rows affected (13.959 seconds)  -- next: populate the join table with every row, nulls included
0: jdbc:hive2://hadoop207.cevent.com:10000> insert overwrite table join_original_table
0: jdbc:hive2://hadoop207.cevent.com:10000> select n.* from
0: jdbc:hive2://hadoop207.cevent.com:10000> join_original_null_id n left join join_original o on n.id=o.id;
INFO : Stage-1 is selected by condition resolver.
INFO : Number of reduce tasks not specified. Estimated from input data size: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591675698957_0002
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0002/
INFO : Starting Job = job_1591675698957_0002, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0002/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0002
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
INFO : 2020-06-09 13:26:11,449 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 13:27:00,291 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 16.19 sec
INFO : 2020-06-09 13:27:21,791 Stage-1 map = 100%, reduce = 69%, Cumulative CPU 20.71 sec
INFO : 2020-06-09 13:27:25,027 Stage-1 map = 100%, reduce = 77%, Cumulative CPU 22.42 sec
INFO : 2020-06-09 13:27:27,828 Stage-1 map = 100%, reduce = 90%, Cumulative CPU 22.42 sec
INFO : 2020-06-09 13:27:33,724 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 27.9 sec
INFO : MapReduce Total cumulative CPU time: 27 seconds 900 msec
INFO : Ended Job = job_1591675698957_0002
INFO : Loading data to table cevent01.join_original_table from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_original_table/.hive-staging_hive_2020-06-09_13-26-01_145_7407125718177019072-1/-ext-10000
INFO : Table cevent01.join_original_table stats: [numFiles=1, numRows=1000000, totalSize=118645854, rawDataSize=117645854]
No rows affected (97.176 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> insert overwrite table join_original_table  -- this time null ids are filtered out first
0: jdbc:hive2://hadoop207.cevent.com:10000> select n.* from
0: jdbc:hive2://hadoop207.cevent.com:10000> (select * from join_original_null_id where id is not null)
0: jdbc:hive2://hadoop207.cevent.com:10000> n left join join_original o on n.id=o.id;
INFO : Stage-1 is selected by condition resolver.
INFO : Number of reduce tasks not specified. Estimated from input data size: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591675698957_0003
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0003/
INFO : Starting Job = job_1591675698957_0003, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0003/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0003
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
INFO : 2020-06-09 13:29:56,277 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 13:30:25,427 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 12.49 sec
INFO : 2020-06-09 13:30:48,717 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 19.14 sec
INFO : MapReduce Total cumulative CPU time: 19 seconds 140 msec
INFO : Ended Job = job_1591675698957_0003
INFO : Loading data to table cevent01.join_original_table from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_original_table/.hive-staging_hive_2020-06-09_13-29-44_221_5080146984779335117-1/-ext-10000
INFO : Table cevent01.join_original_table stats: [numFiles=1, numRows=200001, totalSize=24409561, rawDataSize=24209560]
No rows affected (67.163 seconds)
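The drop from ~97 s (numRows=1000000) to ~67 s (numRows=200001) comes from discarding the null join keys before the shuffle, where they would otherwise all land on one reducer. Before choosing between filtering and salting, it is worth measuring the skew first; a quick check against the same table (a sketch, not from the original session):

select count(*) from join_original_null_id where id is null;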
0: jdbc:hive2://hadoop207.cevent.com:10000> set mapreduce.job.reduces=3;  -- set the number of reducers
No rows affected (0.226 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> insert overwrite table join_original_table  -- rerun of the unfiltered left join, now with 3 reducers
0: jdbc:hive2://hadoop207.cevent.com:10000> select n.* from
0: jdbc:hive2://hadoop207.cevent.com:10000> join_original_null_id n left join join_original o on n.id=o.id;
INFO : Stage-1 is selected by condition resolver.
INFO : Number of reduce tasks not specified. Defaulting to jobconf value of: 3
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591675698957_0004
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0004/
INFO : Starting Job = job_1591675698957_0004, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0004/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0004
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
INFO : 2020-06-09 13:37:54,461 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 13:38:20,067 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 15.17 sec
INFO : 2020-06-09 13:38:25,554 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 18.33 sec
INFO : 2020-06-09 13:38:46,507 Stage-1 map = 100%, reduce = 50%, Cumulative CPU 22.28 sec
INFO : 2020-06-09 13:39:12,015 Stage-1 map = 100%, reduce = 76%, Cumulative CPU 31.73 sec
INFO : 2020-06-09 13:39:16,761 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 43.24 sec
INFO : MapReduce Total cumulative CPU time: 43 seconds 240 msec
INFO : Ended Job = job_1591675698957_0004
INFO : Loading data to table cevent01.join_original_table from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_original_table/.hive-staging_hive_2020-06-09_13-37-42_754_4173942603345328108-1/-ext-10000
INFO : Table cevent01.join_original_table stats: [numFiles=3, numRows=1000000, totalSize=118645854, rawDataSize=117645854]
No rows affected (110.936 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> insert overwrite table join_original_table  -- replace null keys with random values (salting)
0: jdbc:hive2://hadoop207.cevent.com:10000> select n.* from
0: jdbc:hive2://hadoop207.cevent.com:10000> join_original_null_id n full join join_original o on
0: jdbc:hive2://hadoop207.cevent.com:10000> case when n.id is null
0: jdbc:hive2://hadoop207.cevent.com:10000> then concat('hive',rand())
0: jdbc:hive2://hadoop207.cevent.com:10000> else n.id end=o.id;
INFO : Number of reduce tasks not specified. Defaulting to jobconf value of: 3
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591675698957_0005
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0005/
INFO : Starting Job = job_1591675698957_0005, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0005/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0005
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
INFO : 2020-06-09 13:43:57,701 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 13:44:18,174 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 12.1 sec
INFO : 2020-06-09 13:44:19,621 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 14.39 sec
INFO : 2020-06-09 13:44:26,173 Stage-1 map = 84%, reduce = 0%, Cumulative CPU 24.9 sec
INFO : 2020-06-09 13:44:30,664 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 26.35 sec
INFO : 2020-06-09 13:44:45,291 Stage-1 map = 100%, reduce = 28%, Cumulative CPU 28.45 sec
INFO : 2020-06-09 13:44:52,047 Stage-1 map = 100%, reduce = 56%, Cumulative CPU 31.54 sec
INFO : 2020-06-09 13:45:03,169 Stage-1 map = 100%, reduce = 76%, Cumulative CPU 37.31 sec
INFO : 2020-06-09 13:45:04,818 Stage-1 map = 100%, reduce = 89%, Cumulative CPU 42.86 sec
INFO : 2020-06-09 13:45:12,375 Stage-1 map = 100%, reduce = 99%, Cumulative CPU 53.09 sec
INFO : 2020-06-09 13:45:13,851 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 53.26 sec
INFO : MapReduce Total cumulative CPU time: 53 seconds 260 msec
INFO : Ended Job = job_1591675698957_0005
INFO : Loading data to table cevent01.join_original_table from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_original_table/.hive-staging_hive_2020-06-09_13-43-42_102_6854456494005290452-1/-ext-10000
INFO : Table cevent01.join_original_table stats: [numFiles=3, numRows=1799999, totalSize=135445833, rawDataSize=133645834]
No rows affected (97.97 seconds)
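Note how the salted run keeps every row (numRows=1799999, from the full join) while spreading the former null keys across all three reducers: concat('hive', rand()) gives each null row a distinct key that matches nothing on the other side. A sketch of what the salt expression produces (the output shown is illustrative):

select concat('hive', rand());   -- e.g. 'hive0.58127...', different on every evaluation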
8. Link test
Cluster UI link test: http://hadoop207.cevent.com:8088/cluster
9. Dynamic partitioning
0: jdbc:hive2://hadoop207.cevent.com:10000> select b.id from big_join b  -- query only rows with id <= 10
0: jdbc:hive2://hadoop207.cevent.com:10000> join (select id from join_original where id<=10) o
0: jdbc:hive2://hadoop207.cevent.com:10000> on b.id=o.id;
INFO : Stage-1 is selected by condition resolver.
INFO : Number of reduce tasks not specified. Estimated from input data size: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO :   set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO :   set mapreduce.job.reduces=<number>
INFO : number of splits:2
INFO : Submitting tokens for job: job_1591675698957_0014
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0014/
INFO : Starting Job = job_1591675698957_0014, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0014/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0014
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
INFO : 2020-06-09 17:00:28,898 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 17:00:49,450 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.22 sec
INFO : 2020-06-09 17:00:51,358 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 10.76 sec
INFO : 2020-06-09 17:01:04,748 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 14.9 sec
INFO : MapReduce Total cumulative CPU time: 14 seconds 900 msec
INFO : Ended Job = job_1591675698957_0014
+-------+--+
| b.id |
+-------+--+
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 1 |
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
| 2 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 3 |
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
| 4 |
| 5 |
| 5 |
| 5 |
| 5 |
| 5 |
| 5 |
| 5 |
| 5 |
| 5 |
| 5 |
| 6 |
| 6 |
| 6 |
| 6 |
| 6 |
| 6 |
| 6 |
| 6 |
| 6 |
| 6 |
| 7 |
| 7 |
| 7 |
| 7 |
| 7 |
| 7 |
| 7 |
| 7 |
| 7 |
| 7 |
| 8 |
| 8 |
| 8 |
| 8 |
| 8 |
| 8 |
| 8 |
| 8 |
| 8 |
| 8 |
| 9 |
| 9 |
| 9 |
| 9 |
| 9 |
| 9 |
| 9 |
| 9 |
| 9 |
| 9 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
+-------+--+
100 rows selected (48.722 seconds)
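Writing the filter inside the subquery, as above, shrinks the join input to ids <= 10 before the shuffle. The equivalent with a trailing WHERE is sketched below for comparison; recent optimizers often push the predicate down anyway, but placing it in the subquery makes the intent explicit (same tables as above):

select b.id
from big_join b
join join_original o on b.id = o.id
where o.id <= 10;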
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.exec.dynamic.partition=true;  -- enable dynamic partitioning
No rows affected (0.018 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.exec.dynamic.partition.mode=nonstrict;  -- non-strict mode: no static partition value required
No rows affected (0.587 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.exec.max.dynamic.partitions=1000;  -- at most 1000 dynamic partitions across all MapReduce nodes
No rows affected (0.007 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.exec.max.dynamic.partitions.pernode=100;  -- at most 100 dynamic partitions per executing node
No rows affected (0.009 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.exec.max.created.files=100000;  -- cap on HDFS files a job may create; the default is 100,000
No rows affected (0.006 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> set hive.error.on.empty.partition=false;  -- whether to raise an error when an empty partition is generated
No rows affected (0.007 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> create table join_dep_partition(  -- create the partitioned table
0: jdbc:hive2://hadoop207.cevent.com:10000> id int,
0: jdbc:hive2://hadoop207.cevent.com:10000> name string
0: jdbc:hive2://hadoop207.cevent.com:10000> )
0: jdbc:hive2://hadoop207.cevent.com:10000> partitioned by (location int) row format delimited fields terminated by '\t';
No rows affected (2.525 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> insert into table join_dep_partition  -- insert the query result; the partition value is taken from the last select column
0: jdbc:hive2://hadoop207.cevent.com:10000> partition(location)
0: jdbc:hive2://hadoop207.cevent.com:10000> select deptno,dname,locations from c_dept;
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_1591675698957_0015
INFO : The url to track the job: http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0015/
INFO : Starting Job = job_1591675698957_0015, Tracking URL = http://hadoop207.cevent.com:8088/proxy/application_1591675698957_0015/
INFO : Kill Command = /opt/module/hadoop-2.7.2/bin/hadoop job -kill job_1591675698957_0015
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
INFO : 2020-06-09 18:18:43,415 Stage-1 map = 0%, reduce = 0%
INFO : 2020-06-09 18:18:53,201 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.95 sec
INFO : MapReduce Total cumulative CPU time: 1 seconds 950 msec
INFO : Ended Job = job_1591675698957_0015
INFO : Stage-4 is selected by condition resolver.
INFO : Stage-3 is filtered out by condition resolver.
INFO : Stage-5 is filtered out by condition resolver.
INFO : Moving data to: hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_dep_partition/.hive-staging_hive_2020-06-09_18-18-27_297_470948791343936680-7/-ext-10000 from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_dep_partition/.hive-staging_hive_2020-06-09_18-18-27_297_470948791343936680-7/-ext-10002
INFO : Loading data to table cevent01.join_dep_partition partition (location=null) from hdfs://hadoop207.cevent.com:8020/user/hive/warehouse/cevent01.db/join_dep_partition/.hive-staging_hive_2020-06-09_18-18-27_297_470948791343936680-7/-ext-10000
INFO : Time taken for load dynamic partitions : 2375
INFO : Loading partition {location=1800}
INFO : Loading partition {location=1900}
INFO : Loading partition {location=1700}
INFO : Time taken for adding to write entity : 2
INFO : Partition cevent01.join_dep_partition{location=1700} stats: [numFiles=1, numRows=2, totalSize=28, rawDataSize=26]
INFO : Partition cevent01.join_dep_partition{location=1800} stats: [numFiles=1, numRows=1, totalSize=12, rawDataSize=11]
INFO : Partition cevent01.join_dep_partition{location=1900} stats: [numFiles=1, numRows=1, totalSize=9, rawDataSize=8]
No rows affected (32.8 seconds)
0: jdbc:hive2://hadoop207.cevent.com:10000> show partitions join_dep_partition;  -- list the partitions
+----------------+--+
| partition |
+----------------+--+
| location=1700 |
| location=1800 |
| location=1900 |
+----------------+--+
3 rows selected (0.502 seconds)
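For contrast, under the default strict mode the same insert must pin at least one partition value statically; a sketch of the static equivalent, using the location value 1700 seen in the output above:

insert into table join_dep_partition partition(location=1700)
select deptno, dname from c_dept where locations = 1700;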
10. Link verification
Verify the warehouse directory in HDFS: http://hadoop207.cevent.com:50070/explorer.html#/user/hive/warehouse/cevent01.db