目录
1.什么是分区表
分区表就是将一个大表分成若干个小表。分区表分的是文件夹
2.创建数据
[root@hadoop dool]# vim data.txt #创建文本
for i in $(seq 10) #输入这个脚本,会自动生成数据
do
echo -e "$(date -d "$RANDOM minute ago" +%F' '%T'.'%N)"
done|sort -nk1|awk '{print NR"\t"$0}'>test.txt
[root@hadoop dool]# sh data.txt #运行文本会产生数据
[root@hadoop dool]# ls
data.txt test.txt
[root@hadoop dool]# cat test.txt
1 2021-11-02 16:28:39.931093193
2 2021-11-03 15:19:39.933868815
3 2021-11-04 17:14:39.929646653
4 2021-11-05 11:04:39.934388378
5 2021-11-06 09:02:39.932742132
6 2021-11-07 01:47:39.931626788
7 2021-11-07 13:52:39.934925865
8 2021-11-12 01:06:39.930461977
9 2021-11-20 13:16:39.932218685
10 2021-11-20 16:10:39.933310604
3.创建静态表
0: jdbc:hive2://192.168.171.151:10000> create table orders #创建静态表
. . . . . . . . . . . . . . . . . . > (id int,ord_num string)
. . . . . . . . . . . . . . . . . . > partitioned by(month string)
. . . . . . . . . . . . . . . . . . > row format delimited fields
. . . . . . . . . . . . . . . . . . > terminated by '\t';
No rows affected (0.117 seconds)
0: jdbc:hive2://192.168.171.151:10000> show tables; #查看所有表
+---------------+--+
| tab_name |
+---------------+--+
| bucket |
| dept |
| orders |
| student |
| student_ext |
| student_info |
| student_ptn |
+---------------+--+
7 rows selected (0.057 seconds)
4.导入数据
0: jdbc:hive2://192.168.171.151:10000> load data local inpath '/usr/word/dool/test.txt'
. . . . . . . . . . . . . . . . . . > into table orders partition(month='2022-03');
No rows affected (0.624 seconds)
5.创建动态分区表结构
0: jdbc:hive2://192.168.171.151:10000> create table dynamic_table(
. . . . . . . . . . . . . . . . . . > ord_num int
. . . . . . . . . . . . . . . . . . > )
. . . . . . . . . . . . . . . . . . > partitioned by(ord_date string)
. . . . . . . . . . . . . . . . . . > row format delimited fields terminated
. . . . . . . . . . . . . . . . . . > by '\t';
No rows affected (0.106 seconds)
6.开启动态分区
0: jdbc:hive2://192.168.171.151:10000> set hive.exec.dynamic.partition=true; #使用动态分区
No rows affected (0.027 seconds)
0: jdbc:hive2://192.168.171.151:10000> set hive.exec.dynamic.partition.mode=nonstrict; #无限制模式
No rows affected (0.005 seconds)
7.把刚才静态分区的数据导入到静态分区
用ord_date自动分区
0: jdbc:hive2://192.168.171.151:10000> insert into table dynamic_table
. . . . . . . . . . . . . . . . . . > partition(ord_date)
. . . . . . . . . . . . . . . . . . > select id,ord_num from orders;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
No rows affected (19.511 seconds)
端口10002上有显示运行中
查看wed界面动态分区成功