Hive的基本语法

最新推荐文章于 2024-08-07 10:49:21 发布

新东方毕业生

最新推荐文章于 2024-08-07 10:49:21 发布

阅读量466

点赞数 1

分类专栏： Hive 文章标签： hive

本文链接：https://blog.csdn.net/pengfei1410/article/details/104674538

版权

本文详细介绍了 Hive 的基本语法，包括 DDL（数据定义语言）的库和表操作，如创建、查看、删除和修改表，以及 DML（数据管理语言）的数据导入、分区表的操作。还探讨了 Hive 的 DQL（数据查询语言），如连接、过滤、分组、排序和数据导出等概念。重点讲解了分区表和分桶表的数据加载，并对比了静态和动态分区的差异。

摘要由CSDN通过智能技术生成

DDL

DDL：data-define-lauguage 数据定义语言

库操作

建库：create database if not exists dbname;

切换库 ：use dbname;

查询正在使用的库：select current_database();

查询库列表

查询所有数据库：show databases;
模糊查询：show databases like "*test";
查询库的详细描述信息：desc database dbname;

删除数据库

删除空数据库：drop database if exists dbname;
级联删除：drop database dbname cascade;

表操作

创建表

建表语句

create [external] table [if not exists] table_name
 [(col_name data_type [comment col_comment], ...)]
 [comment table_comment]
 [partitioned by (col_name data_type [comment col_comment], ...)]
 [clustered by (col_name, col_name, ...)
 [sorted by (col_name [asc|desc], ...)] into num_buckets buckets]
 [row format row_format]
 [stored as file_format]
 [location hdfs_path]

external ：外部表关键字，不加时创建的是内部表
if not exists ：防止报错，当表不存在时创建表
comment ：指定字段或表的描述信息
partitioned by（col_name data_type 字段名与类型） ：指定分区表的分区字段，注意分区字段一定不能在建表字段中，分区字段是单独存储的
clustered by (col_name, col_name, …) [sorted by (col_name [asc|desc], …)] into num_buckets buckets]：指定分桶，clustered by 指定分桶标志，into num_buckets buckets 指定分桶个数，sorted by指定桶中的排序规则，注意分桶字段一定是建表字段中的某一个或某几个
row format row_format ：指定行格式化，指定字段分割符（指的是存储文件的分隔符而不是加载文件的分隔符，加载文件只是把文件复制到hdfs的指定路径下，当使用插入方式或通过hbase表解析出来是存储的文件中的字段是按照指定分割符分割的）

delimited lines terminated by \n 行分割符
delimited fields terminated by \001 字段分割符

stored as file_format ：指定表数据的存储格式

textfile ：默认的存储格式
SequenceFile ：二进制的存储格式
RCFile ：行列结合的形式存储
ORCFile：以hanlie结合的形式存储，是RCFile存储格式的优化

location ：指定存储位置，如果不指定是默认存储在 /user/hive/warehouse下

示例：创建表来存储学生的信息（编号 int，姓名 string，性别 string，年龄 int，部门 string）

创建内部表

create table if not exists stu_managed(sid int,name string,sex string,age int,dept string) row format delimited fields terminated by ",";

创建外部表

create external table if not exists stu_external(sid int,name string,sex string,age int,dept string) row format delimited fields terminated by ",";

创建分区表

分区字段 dept

create table if not exists stu_ptn_dept(sid int,name string,sex string,age int) partitioned by (dept string) row format delimited fields terminated by "," location "/user/hivedata/stu_ptn";

分区字段 sex

create table if not exists stu_ptn_sex(sid int,name string,age int,dept string) partitioned by (sex string) row format delimited fields terminated by ",";

创建分桶表：分桶字段 age 分桶个数 3

create table if not exists stu_buk(sid int,name string,sex string,age int,dept string) clustered by (age) sorted by(sid) into 3 buckets row format delimited fields terminated by ",";

表复制：create table tbname like tbname1;仅复制表结构，不复制表数据和属性

create external table stu_copy like stu_buk;

ctas建表：create table tbname as select ....将查询结果存储在一个表中，使用ctas建表时默认的字段名称为_c0,_c1…

create table tbname as select * from stu_managed;

查看表

查看表信息：desc [extended] [formatted] 库,表,分区,视图...
查看表的字段信息：desc tbname;
查看表的扩展信息：desc extended tbname;
格式化显示表的详细信息：desc formatted tbname;
查看表列表：

查看当前数据库下的表：show tables;
查看指定数据库下的表：show tables in dbname;

查看建表语句：show create table table_name;

删除表

drop table tbname;

修改表

修改表名：al

最低0.47元/天解锁文章

新东方毕业生

关注

1
点赞
踩
2

收藏

觉得还不错? 一键收藏
1
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录