Hive Database and Table Operations

weixin_53762943

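Last modified: 2022-11-23 15:22:50

Tags: hive, hadoop, big data

First published: 2022-10-21 16:31:50

Copyright notice: This is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_53762943/article/details/127447942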

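This article introduces Hive database and table operations: creating, dropping, and switching databases; creating managed and external tables; importing and exporting data; and working with complex data types. It also covers partitioned tables and bucketing, and shows the different types of JOIN. Together these cover day-to-day Hive administration and use.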


Contents

1. Hive database operations

2. Hive table operations

3. Hive partitioned tables

Preparation

Start the server (hiveserver2) and connect with the client (beeline -u 'jdbc:hive2://192.168.67.110:10000' -n root) beforehand; Hive operations can then be run.

1. Hive database operations

1)show databases;

2)create database t1;

3)show databases;

4) Create a database with properties and view its details

Create: create database if not exists t3 with dbproperties('creator'='hadoop','date'='2019-01-01');

View: desc database extended t3;

5) Drop a database

drop database if exists t3 cascade;

6) Switch databases and check which database is currently in use

use t2;
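Step 6 also calls for checking which database is active, but only the switch is listed. A minimal sketch, assuming a Hive release that provides the built-in current_database() function and using t1 (created in step 2) as the target:

```sql
-- Switch to a database that exists (t1 was created in step 2).
use t1;

-- Return the name of the database the current session is using.
select current_database();
```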

2. Hive table operations

Managed (internal) tables:

1) Create a managed table and find its storage path on HDFS.

create table worker_1(id int,name string,salary bigint,addr string)

row format delimited

fields terminated by ',';
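To locate the storage path mentioned in step 1, one option is to ask Hive for the table's metadata instead of guessing the warehouse directory. This sketch assumes worker_1 was created as above; the actual path depends on the hive.metastore.warehouse.dir setting of your installation.

```sql
-- Print detailed metadata for worker_1.
-- The "Location" row holds the HDFS directory that stores the table's files,
-- and the "Table Type" row reads MANAGED_TABLE for a managed (internal) table.
desc formatted worker_1;
```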

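External tables:

2) Create an external table. Querying it right after creation shows an empty table: the specified HDFS path does not contain any data yet.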

create external table worker_2(id int,name string,salary bigint,addr string)

row format delimited

fields terminated by ','

location '/worker';

Import:

3) Load a file from the local disk into a table.

load data local inpath '/opt/testData/hive/worker_1.txt' overwrite into table worker_1;

4) Load a file that is already on HDFS into a table

load data inpath '/worker/worker_1.txt' into table worker_2;

5) Insert the result of a query on another table into a new table; the new table is created automatically.
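No statement is listed for this step. One common way to do it is CREATE TABLE ... AS SELECT (CTAS). The sketch below is an illustration only: the table name worker_high_salary and the salary filter are assumptions, not part of the original article.

```sql
-- Create worker_high_salary (hypothetical name) from a query on worker_1.
-- The new table takes its column names and types from the select list.
create table worker_high_salary as
select id, name, salary, addr
from worker_1
where salary > 5000;  -- the threshold is illustrative only
```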

6) Insert the result of a query on another table into a table that already exists.
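No statement is listed for this step either. A minimal sketch using INSERT ... SELECT, assuming a target table named worker_copy (a hypothetical name) with the same schema as worker_1:

```sql
-- Create an empty table with the same schema as worker_1 (hypothetical name).
create table if not exists worker_copy like worker_1;

-- Append the query result to the existing table.
insert into table worker_copy
select * from worker_1;

-- Alternatively, replace the table's current contents with the query result.
insert overwrite table worker_copy
select * from worker_1;
```

`insert into` appends rows, while `insert overwrite` discards the existing rows first, so choose according to whether the old data should survive.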

Export:

7) Export data from a Hive table to an HDFS directory
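The export statement is not shown for this step. One way to do it is INSERT OVERWRITE DIRECTORY; the target path /export/worker below is a placeholder chosen for illustration.

```sql
-- Write the query result to an HDFS directory as comma-separated text.
-- Caution: "overwrite" replaces whatever the target directory already holds.
insert overwrite directory '/export/worker'
row format delimited
fields terminated by ','
select * from worker_1;
```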

8) Export data from a Hive table to a directory on the local disk.
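The same statement with the LOCAL keyword writes to a local filesystem instead of HDFS. The path below is a placeholder. When connected through beeline, "local" most likely refers to the machine running hiveserver2 rather than the client machine, so check there for the output.

```sql
-- Write the query result to a local directory as comma-separated text.
-- Caution: existing content of the target directory is replaced.
insert overwrite local directory '/tmp/hive_export/worker'
row format delimited
fields terminated by ','
select * from worker_1;
```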

9) Hive complex data types:

Create the table:

create table movie_info(

id int,

name string,

work_location array<string>,

piaofang map<string,bigint>,

address struct<location:string,zipcode:int,phone:string,value:int>)

row format delimited

fields terminated by " "

collection items terminated by ","

map keys terminated by ":" ;

Load the data:

load data local inpath "/usr/datadir/movie_1.txt" into table movie_info;

Queries:

array: select work_location[0] from movie_info;

map: select piaofang["a1"] from movie_info;

struct: select address.location from movie_info;
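Given the delimiters declared for movie_info, each input row separates fields with spaces, separates array elements and struct members with commas, and writes map entries as key:value. Beyond indexing, Hive's built-in collection functions can inspect these columns. The queries below are a sketch that assumes movie_info has been loaded as described; the aliases locs and loc are arbitrary.

```sql
-- Number of elements in the array column.
select name, size(work_location) from movie_info;

-- All keys and all values of the map column.
select name, map_keys(piaofang), map_values(piaofang) from movie_info;

-- Turn each array element into its own row.
select name, loc
from movie_info
lateral view explode(work_location) locs as loc;
```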

10) Hive file storage formats

Create a seq table whose underlying file type is sequencefile.
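The CREATE statement for this table is not shown. A minimal sketch, assuming the table is named worker_seq (a hypothetical name) and reuses the worker_1 columns:

```sql
-- Declare the storage format with "stored as".
create table worker_seq(id int, name string, salary bigint, addr string)
stored as sequencefile;

-- "load data" only moves files and does not convert plain text,
-- so populate the table through a query to get real sequencefile output.
insert into table worker_seq
select * from worker_1;
```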

Save query results directly as sequencefile.

Save query results directly as orc.

Save query results directly as parquet.
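The statements for these three cases are not shown. One way to save query results in a given format is CTAS with a "stored as" clause. The table names below are hypothetical and worker_1 is assumed as the source.

```sql
-- Save the query result as a sequencefile-backed table.
create table worker_seq_ctas stored as sequencefile as
select * from worker_1;

-- Save the query result as an ORC-backed table.
create table worker_orc stored as orc as
select * from worker_1;

-- Save the query result as a Parquet-backed table.
create table worker_parquet stored as parquet as
select * from worker_1;
```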

11) View table information

Create a table:

create table student(id int,name string,age int)

row format delimited

fields terminated by ',';

View the table's column definitions.

desc student;

View detailed table information.

desc extended student;

View the table's full CREATE TABLE statement.

show create table student;

12) Alter a table

Rename the table: alter table student rename to new_student;

13) Drop a table
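No statement is listed for dropping a table. A minimal sketch, assuming the table was renamed to new_student in step 12:

```sql
-- Drop the table; "if exists" avoids an error when the table is absent.
-- For a managed table this removes both the metadata and the data files.
-- For an external table only the metadata is removed and the files stay on HDFS.
drop table if exists new_student;
```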

14) Truncate a table

truncate table student;

3. Hive partitioned tables

1) Creating a partitioned table

A partition is a subdirectory inside the table's directory.

Create the table

create table worker_4(id int,name string,salary bigint,addr string)

partitioned by (day string)

row format delimited

fields terminated by ',';

2) Load data into partitions

load data local inpath '/opt/testData/hive/worker_1.txt' into table worker_4 partition(day='01');

load data local inpath '/opt/testData/hive/worker_1.txt' into table worker_4 partition(day='02');

3) Add and drop partitions

View partition information.

show partitions worker_4;

Add partitions:

alter table worker_4 add partition(day='03') partition(day='04');

Add a partition by loading data:

load data local inpath '/opt/testData/hive/worker_3.txt' into table worker_4 partition(day='05');
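The heading for this step mentions dropping partitions, but only additions are listed. The sketch below drops one of the partitions added above and then filters on the partition column, which lets Hive read only the matching subdirectory.

```sql
-- Remove the day='04' partition (for a managed table, its data is removed too).
alter table worker_4 drop if exists partition(day='04');

-- Confirm the remaining partitions.
show partitions worker_4;

-- Query a single partition; day behaves like an ordinary column in the filter.
select * from worker_4 where day='01';
```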

4. Hive bucketing:

1) Insert data into buckets

2) View storage information

3) View bucket data
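The statements for these three steps are not shown. The sketch below is one possible version, not the author's original: the table name worker_bucket, the bucketing column id, and the bucket count 4 are all assumptions.

```sql
-- Create a table whose rows are hashed on id into 4 buckets (files).
create table worker_bucket(id int, name string, salary bigint, addr string)
clustered by (id) into 4 buckets
row format delimited
fields terminated by ',';

-- 1) Insert data into the buckets. Use insert ... select rather than load data,
--    because load data only moves files and does not distribute rows.
--    On Hive 1.x you may first need: set hive.enforce.bucketing=true;
insert overwrite table worker_bucket
select * from worker_1;

-- 2) View storage information, including "Num Buckets" and "Bucket Columns".
desc formatted worker_bucket;

-- 3) View the rows that fall into the first of the 4 buckets.
select * from worker_bucket tablesample(bucket 1 out of 4 on id);
```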

Hive table joins:

Join operations:

Data preparation:

Create the tables and load the data:

create table t_order(

orderid int,

name string

)

row format delimited

fields terminated by ",";

load data local inpath '/usr/datadir/order.txt' into table t_order;

create table t_goods(

goodid int,

price int

)

row format delimited

fields terminated by ",";

load data local inpath '/usr/datadir/goods.txt' into table t_goods;

Inner join: inner join
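An inner join returns only the rows that have a match in both tables. No query is listed for it here; following the pattern of the other joins in this section, it could look like this:

```sql
-- Keep only orders whose orderid matches a goodid in t_goods.
select *
from t_order inner join t_goods
on orderid = goodid;
```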

Left outer join:

select *

from t_order left join t_goods

on orderid = goodid;

Right outer join:

select *

from t_order right join t_goods

on orderid = goodid;

Full outer join:

select *

from t_order full join t_goods

on orderid = goodid;
