Hive Basic Usage

遥遥晚风点点

Tags: hive

Article link: https://blog.csdn.net/Mr_ye931/article/details/106247335

1. Internal tables and external tables

Internal table: a table whose data directory is managed by Hive itself (on HDFS):

create table t_2(id int,name string,salary bigint,add string)
row format delimited
fields terminated by ',';

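External table: a table mapped onto files in an existing directory that the user specifies:

create external table t_3(id int,name string,salary bigint,add string)
row format delimited
fields terminated by ','
location '/path/'; -- replace with the directory that holds the data files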

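Differences: the directory of an internal table is created by Hive under the default warehouse directory: /user/hive/warehouse/....
       the directory of an external table is specified by the user when creating the table: location '/path/'

       dropping an internal table deletes both the table's metadata and its data directory;
       dropping an external table deletes only the table's metadata; its data directory is left in place.

Why it matters: a data warehouse usually draws its data from some upstream source, typically
files produced by other application systems, whose directory layout follows no fixed convention.
An external table lets Hive map onto such a directory wherever it is; and even if the table is
later dropped in Hive, the data directory is not deleted, so the other application systems are unaffected.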
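
The drop behavior described above can be checked with a small sketch like the following. The table name t_ext_demo and the path /data/app_logs/ are made up for illustration and are not from the original post; describe formatted is used only to show which kind of table Hive thinks it is.

```sql
-- hypothetical table name and location, for illustration only
create external table t_ext_demo(id int, name string)
row format delimited
fields terminated by ','
location '/data/app_logs/';

-- the "Table Type" row reads EXTERNAL_TABLE here (MANAGED_TABLE for an internal table)
describe formatted t_ext_demo;

-- removes only the metadata; the files under /data/app_logs/ stay where they are
drop table t_ext_demo;
```

Running the same three steps without the external keyword and location clause should instead remove the table's directory under the warehouse path when the table is dropped.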

2. The partitioning keyword PARTITIONED BY

create table t_4(ip string,url string,staylong int)
partitioned by (day string) -- the partition column must not also appear among the table columns
row format delimited
fields terminated by ',';

Load data into different partition directories:

load data local inpath '/root/weblog.1' into table t_4 partition(day='2017-04-08');
load data local inpath '/root/weblog.2' into table t_4 partition(day='2017-04-09');
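
Each of the loads above places its file in a separate subdirectory of the table directory, named after the partition value (day=2017-04-08 and day=2017-04-09). A minimal sketch of how the partitions of t_4 could then be inspected and queried; the alias cnt is just an illustrative name:

```sql
-- list the partitions that currently exist for t_4
show partitions t_4;

-- the partition column can be used like an ordinary column;
-- filtering on it lets Hive read only the matching partition directory
select ip, url, staylong
from t_4
where day = '2017-04-08';

-- row count per partition
select day, count(*) as cnt
from t_4
group by day;
```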

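3. Loading data

3.1 Loading a file from the local disk of the machine where Hive runs into a table

load data local inpath '/root/weblog.1' into table t_1;
-- or, to replace the existing data: load data local inpath '/root/weblog.1' overwrite into table t_1;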

3.2 Loading a file from HDFS into a table

load data inpath '/user.data.2' into table t_1;

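Without the local keyword, the file is moved from its HDFS path into the table directory (with local, the file is copied from the local disk and the original is left in place).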
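
One way to observe the move is to list the directories before and after the load. This is only a sketch: it assumes the statements are run in the Hive CLI (which accepts dfs commands), and that t_1 is an internal table in the default database under the default warehouse directory.

```sql
-- before the load: the source file is still at its original HDFS path
dfs -ls /user.data.2;

load data inpath '/user.data.2' into table t_1;

-- after the load: the file should be listed under the table directory
-- (assumed location: default database, default warehouse directory)
dfs -ls /user/hive/warehouse/t_1;
```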

3.3 Querying data from another table and inserting it into a newly created table

create table t_1_jz
as
select id,name from t_1;

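3.4 Querying data from another table and inserting it into an existing table
Suppose the target table already exists; it can be created in advance: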

  create table t_1_hd like t_1;

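Then query some rows from t_1 and insert them into t_1_hd (the selected columns must match the columns of t_1_hd in number and order):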

insert into table t_1_hd
select
id,name,add
from t_1
where add='handong';
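
insert into appends to whatever t_1_hd already holds. If the goal is to replace the table contents on every run, insert overwrite can be used instead. A minimal sketch reusing the filter above; select * is used here so the selected columns line up with t_1_hd, which was created like t_1 (this assumes t_1 is not partitioned):

```sql
-- replaces the existing contents of t_1_hd instead of appending to them
insert overwrite table t_1_hd
select *
from t_1
where add = 'handong';
```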
