1. Create the table that will be copied, old_table:
hive (default)> create table old_table(age bigint,height string,weight string) partitioned by(p_month int,p_day int,p_hour int)row format delimited fields terminated by ',' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
OK
Time taken: 0.429 seconds
hive (default)> desc old_table;
OK
col_name data_type comment
age bigint
height string
weight string
p_month int
p_day int
p_hour int
# Partition Information
# col_name data_type comment
p_month int
p_day int
p_hour int
Time taken: 0.112 seconds, Fetched: 13 row(s)
2. Prepare data for old_table. Create a text file data.txt with three rows; the table was created with a comma delimiter, so the fields are comma-separated:
25,170,70
24,175,65
27,180,80
3. Use the LOAD DATA command to load the file into the table:
hive (default)> LOAD DATA LOCAL INPATH '/home/hadoop/temp/data.txt' OVERWRITE INTO TABLE old_table PARTITION (p_month='201609',p_day='20160908',p_hour='2016090800');
Loading data to table default.old_table partition (p_month=201609, p_day=20160908, p_hour=2016090800)
OK
Time taken: 3.424 seconds
hive (default)> show partitions old_table;
OK
partition
p_month=201609/p_day=20160908/p_hour=2016090800
Time taken: 0.202 seconds, Fetched: 1 row(s)
hive (default)> select * from old_table;
OK
old_table.age old_table.height old_table.weight old_table.p_month old_table.p_day old_table.p_hour
25 170 70 201609 20160908 2016090800
24 175 65 201609 20160908 2016090800
27 180 80 201609 20160908 2016090800
Time taken: 0.099 seconds, Fetched: 3 row(s)
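The partition values in the LOAD statement map directly to a nested key=value directory hierarchy under the table's warehouse path, which is why the later filesystem copy works. A minimal Python sketch of that mapping (the function name is ours; /user/hive/warehouse is the default warehouse root):

```python
# Sketch: how a Hive partition spec becomes an HDFS directory path.
# Hive stores each partition of a table in a nested key=value directory.

def partition_path(warehouse, table, spec):
    """Build the directory path Hive uses for a partition spec.

    spec is an ordered list of (column, value) pairs, in the order
    the partition columns were declared.
    """
    parts = "/".join(f"{col}={val}" for col, val in spec)
    return f"{warehouse}/{table}/{parts}"

path = partition_path(
    "/user/hive/warehouse", "old_table",
    [("p_month", "201609"), ("p_day", "20160908"), ("p_hour", "2016090800")],
)
print(path)
# → /user/hive/warehouse/old_table/p_month=201609/p_day=20160908/p_hour=2016090800
```

This matches the partition string shown by SHOW PARTITIONS above, just prefixed with the table's warehouse directory.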
4. Copy the table structure to a new table, new_table:
hive (default)> create table new_table like old_table;
OK
Time taken: 0.099 seconds
hive (default)> desc new_table;
OK
col_name data_type comment
age bigint
height string
weight string
p_month int
p_day int
p_hour int
# Partition Information
# col_name data_type comment
p_month int
p_day int
p_hour int
Time taken: 0.093 seconds, Fetched: 13 row(s)
hive (default)> select * from new_table limit 10;
OK
new_table.age new_table.height new_table.weight new_table.p_month new_table.p_day new_table.p_hour
Time taken: 0.566 seconds
hive (default)> show partitions new_table;
OK
partition
Time taken: 0.063 seconds
5. Then use hadoop fs -cp to copy the old table's data into the new table's HDFS directory:
[hadoop@node1 ~]$ hadoop fs -cp /user/hive/warehouse/old_table/* /user/hive/warehouse/new_table/
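Because the copy is recursive, the key=value partition layout under old_table is reproduced exactly under new_table, and that layout is what lets MSCK REPAIR rediscover the partitions in the next step. A local-filesystem sketch of the same idea (paths and the helper are illustrative, using shutil in place of hadoop fs -cp):

```python
# Sketch: a recursive copy preserves the key=value partition directories,
# which is what MSCK REPAIR TABLE later scans to rebuild the metadata.
import os
import shutil
import tempfile

def copy_table_dir(src, dst):
    """Copy a table directory tree, keeping every partition subdirectory."""
    shutil.copytree(src, dst)
    # Return the relative subdirectory paths that survived the copy.
    return sorted(
        os.path.relpath(os.path.join(root, d), dst)
        for root, dirs, _ in os.walk(dst)
        for d in dirs
    )

root = tempfile.mkdtemp()
old = os.path.join(root, "old_table")
os.makedirs(os.path.join(old, "p_month=201609", "p_day=20160908", "p_hour=2016090800"))
layout = copy_table_dir(old, os.path.join(root, "new_table"))
print(layout)
```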
6. Run MSCK REPAIR TABLE new_table to refresh the metadata. This scans the table's HDFS directory and registers any partition directories that are missing from the metastore:
hive (default)> MSCK REPAIR TABLE new_table;
OK
Partitions not in metastore: new_table:p_month=201609/p_day=20160908/p_hour=2016090800
Repair: Added partition to metastore new_table:p_month=201609/p_day=20160908/p_hour=2016090800
Time taken: 0.447 seconds, Fetched: 2 row(s)
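The output above shows both halves of what MSCK REPAIR does: it lists partitions found on the filesystem but not in the metastore, then adds them. A rough Python sketch of that diff, with simulated listings (the function name and data are ours):

```python
# Sketch: the core of MSCK REPAIR TABLE -- diff the partition directories
# found on the filesystem against the partitions the metastore knows.

def find_missing_partitions(fs_dirs, metastore_partitions):
    """Return filesystem partition paths the metastore does not know about."""
    return sorted(set(fs_dirs) - set(metastore_partitions))

# Simulated state right after the hadoop fs -cp step:
on_disk = ["p_month=201609/p_day=20160908/p_hour=2016090800"]
in_metastore = []  # new_table has no partitions registered yet

missing = find_missing_partitions(on_disk, in_metastore)
print(missing)  # the partition MSCK would add to the metastore
```

Each path returned corresponds to one "Repair: Added partition to metastore" line in the MSCK output.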
7. Query new_table to confirm the data is now visible:
hive (default)> select * from new_table;
OK
new_table.age new_table.height new_table.weight new_table.p_month new_table.p_day new_table.p_hour
25 170 70 201609 20160908 2016090800
24 175 65 201609 20160908 2016090800
27 180 80 201609 20160908 2016090800
Time taken: 0.5 seconds, Fetched: 3 row(s)