1. Altering table partitions
(1) Adding partitions to an existing external table (alter table ... add partition). First create a directory on HDFS for the external table, along with two partition subdirectories:
hadoop fs -mkdir /user/demo/ids
hadoop fs -mkdir /user/demo/ids/2016-05-31
hadoop fs -mkdir /user/demo/ids/2016-05-30
Copy the data files into these directories:
hadoop fs -put /tmp/2016-05-31.txt /user/demo/ids/2016-05-31/
hadoop fs -put /tmp/2016-05-30.txt /user/demo/ids/2016-05-30/
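To confirm the files landed where expected before wiring up the table, you can list the two partition directories (same paths as above):

```shell
# Each listing should show one .txt data file
hadoop fs -ls /user/demo/ids/2016-05-31/
hadoop fs -ls /user/demo/ids/2016-05-30/
```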
Create the external table and then add the partitions to it:
create external table ids (a int) partitioned by (datestamp string) location '/user/demo/ids';
Add a partition to the table:
alter table ids add partition (datestamp='2016-05-30') location '/user/demo/ids/2016-05-30/';
hive>select * from ids;
11 2016-05-30
12 2016-05-30
13 2016-05-30
14 2016-05-30
alter table ids add partition (datestamp='2016-05-31') location '/user/demo/ids/2016-05-31/';
hive>select * from ids;
11 2016-05-30
12 2016-05-30
13 2016-05-30
14 2016-05-30
1 2016-05-31
2 2016-05-31
3 2016-05-31
4 2016-05-31
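Because datestamp is a partition column, filtering on it prunes the scan to the matching directory only. A minimal check against the table and data above:

```sql
-- List the partitions registered in the metastore
show partitions ids;

-- Only /user/demo/ids/2016-05-30/ is read for this query
select * from ids where datestamp = '2016-05-30';
```

The second query should return only the four 2016-05-30 rows (11 through 14) shown above.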
(2) Adding partitions to a managed (internal) table (msck repair table)
Create a managed table:
create table ids_internal (a int) partitioned by (datestamp string);
Insert a few rows into two different partitions:
insert into ids_internal partition (datestamp = '2016-05-30') values (1);
insert into ids_internal partition (datestamp = '2016-05-31') values (11);
show partitions ids_internal;
datestamp=2016-05-30
datestamp=2016-05-31
Now create a new subdirectory under the table's warehouse directory and put a file into it:
hadoop fs -mkdir /apps/hive/warehouse/ids_internal/datestamp=2016-05-21
hadoop fs -put /tmp/2016-05-21.txt /apps/hive/warehouse/ids_internal/datestamp=2016-05-21
Run the msck repair table command to add the new partition to the table:
msck repair table ids_internal;
show partitions ids_internal;
datestamp=2016-05-21
datestamp=2016-05-30
datestamp=2016-05-31
The msck repair table command scans the subdirectories under /apps/hive/warehouse/ids_internal for the ids_internal table, and because it finds a new directory named datestamp=2016-05-21, it adds that subdirectory to the table as a new partition. This approach is especially useful when you have added many new partition directories and want to update the table definition for all of them in one step. Note: this method works only for managed (internal) tables.
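Without msck repair table, each new directory would need its own statement, using the same alter table syntax shown for the external table in section (1). A sketch of the manual equivalent for the directory added above:

```sql
-- One statement per new directory; msck repair table does this for all of them at once
alter table ids_internal add partition (datestamp='2016-05-21')
  location '/apps/hive/warehouse/ids_internal/datestamp=2016-05-21';
```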