1 hive版本
2.1.1版本(其实版本不是很重要,其他低一些的版本也是有这个命令的,即同样参考本文章操作) :
2 问题呈现
HDFS上有数据:
如下shell呈现:
[hive@cdh-test-01 ~]$ hadoop fs -cat /user/hive/warehouse/test.db/t_psn_par/dt=20210318/*
1;Jay;1995;Xizang
2;Lily;1991;Shanghai
3;Jack;1992;Guangxi
4;Jenny;1999;Xinjiang
5;Jay;1995;Xizang
6;Tom;1990;Beijing
7;Lily;1991;Shanghai
8;Jack;1992;Guangxi
9;Jenny;1999;Xinjiang
[hive@cdh-test-01 ~]$ hadoop fs -cat /user/hive/warehouse/test.db/t_psn_par/dt=20210317/*
31;Lily;1991;Shanghai31
32;Jack;1992;Guangxi32
33;Jenny;1999;Xinjiang33
1;Jay;1995;Xizang
2;Lily;1991;Shanghai
3;Jack;1992;Guangxi
4;Jenny;1999;Xinjiang
5;Jay;1995;Xizang
6;Tom;1990;Beijing
7;Lily;1991;Shanghai
8;Jack;1992;Guangxi
9;Jenny;1999;Xinjiang
[hive@cdh-test-01 ~]$ hadoop fs -cat /user/hive/warehouse/test.db/t_psn_par/dt=20210317/t_psn
1;Jay;1995;Xizang
2;Lily;1991;Shanghai
3;Jack;1992;Guangxi
4;Jenny;1999;Xinjiang
5;Jay;1995;Xizang
6;Tom;1990;Beijing
7;Lily;1991;Shanghai
8;Jack;1992;Guangxi
9;Jenny;1999;Xinjiang
[hive@cdh-test-01 ~]$ $ hadoop fs -ls /user/hive/warehouse/test.db/t_psn_par/dt=20210317/
-bash: $: 未找到命令
[hive@cdh-test-01 ~]$ hdfs dfs -ls /user/hive/warehouse/test.db/t_psn_par/dt=20210317/
Found 2 items
-rwxrwxr-t 3 hive hive 72 2021-03-19 17:12 /user/hive/warehouse/test.db/t_psn_par/dt=20210317/20210319_1
-rwxrwxr-t 3 hive hive 182 2021-03-19 16:38 /user/hive/warehouse/test.db/t_psn_par/dt=20210317/t_psn
[hive@cdh-test-01 ~]$ hadoop fs -cat /user/hive/warehouse/test.db/t_psn_par/dt=20210317/20210319_1
31;Lily;1991;Shanghai31
32;Jack;1992;Guangxi32
33;Jenny;1999;Xinjiang33
[hive@cdh-test-01 ~]$
同样在hive的表中也确定load数据了:
create external table if not exists t_psn_par (
seq int comment '客户编号,主键',
name string comment '姓名',
year int comment '年龄',
city string comment '城市') partitioned by (dt string)
row format delimited
fields terminated by ';'
LOAD DATA INPATH '/user/hive/warehouse/test.db/t_psn/20210319/t_psn' OVERWRITE INTO TABLE t_psn_par PARTITION(dt=20210318);
LOAD DATA INPATH '/user/hive/warehouse/test.db/t_psn/20210319_1' OVERWRITE INTO TABLE t_psn_par PARTITION(dt='20210317'); #### 小细节:dt=''中加上符号''加不加 不影响
问题出现:
hive (test)> select * from t_psn_par; ###因为dt=20210318分区的数据实际上没有load成功,只能呈现
#### dt=20210317分区的这些数据
OK
t_psn_par.seq t_psn_par.name t_psn_par.year t_psn_par.city t_psn_par.dt
31 Lily 1991 Shanghai31 20210317
32 Jack 1992 Guangxi32 20210317
33 Jenny 1999 Xinjiang33 20210317
1 Jay 1995 Xizang 20210317
2 Lily 1991 Shanghai 20210317
3 Jack 1992 Guangxi 20210317
4 Jenny 1999 Xinjiang 20210317
5 Jay 1995 Xizang 20210317
6 Tom 1990 Beijing 20210317
7 Lily 1991 Shanghai 20210317
8 Jack 1992 Guangxi 20210317
9 Jenny 1999 Xinjiang 20210317
NULL NULL NULL NULL 20210317
Time taken: 0.164 seconds, Fetched: 13 row(s)
hive (test)> select * from t_psn_par where dt='20210317'; ###指定分区的值
OK
t_psn_par.seq t_psn_par.name t_psn_par.year t_psn_par.city t_psn_par.dt
31 Lily 1991 Shanghai31 20210317
32 Jack 1992 Guangxi32 20210317
33 Jenny 1999 Xinjiang33 20210317
1 Jay 1995 Xizang 20210317
2 Lily 1991 Shanghai 20210317
3 Jack 1992 Guangxi 20210317
4 Jenny 1999 Xinjiang 20210317
5 Jay 1995 Xizang 20210317
6 Tom 1990 Beijing 20210317
7 Lily 1991 Shanghai 20210317
8 Jack 1992 Guangxi 20210317
9 Jenny 1999 Xinjiang 20210317
NULL NULL NULL NULL 20210317
Time taken: 0.424 seconds, Fetched: 13 row(s)
hive (test)> select * from t_psn_par where dt='20210318'; ###指定分区的值 没有数据 ??问题出来了!!!
OK
t_psn_par.seq t_psn_par.name t_psn_par.year t_psn_par.city t_psn_par.dt
Time taken: 0.151 seconds
3 问题解决
手动解决:hive (test)> msck repair table t_psn_par ;
hive (test)> msck repair table t_psn_par ; ### 手动修复 !!!
OK
Partitions not in metastore: t_psn_par:dt=20210318
Repair: Added partition to metastore t_psn_par:dt=20210318
Time taken: 0.211 seconds, Fetched: 2 row(s)
hive (test)> select * from t_psn_par where dt='20210318' ; ###查询之前没有数据的分区,成功!
OK
t_psn_par.seq t_psn_par.name t_psn_par.year t_psn_par.city t_psn_par.dt
1 Jay 1995 Xizang 20210318
2 Lily 1991 Shanghai 20210318
3 Jack 1992 Guangxi 20210318
4 Jenny 1999 Xinjiang 20210318
5 Jay 1995 Xizang 20210318
6 Tom 1990 Beijing 20210318
7 Lily 1991 Shanghai 20210318
8 Jack 1992 Guangxi 20210318
9 Jenny 1999 Xinjiang 20210318
NULL NULL NULL NULL 20210318
Time taken: 0.208 seconds, Fetched: 10 row(s)