Table Operations in Hive

ITLV007

Article link: https://blog.csdn.net/itlv007/article/details/93129914

Table types

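1. Internal (managed) tables
2. External tables
3. Partitioned tables
4. Bucketed tables
Extension: a temporary table is valid only within the current session. When the session ends, the table and all of its data are deleted. In other respects it behaves like an internal table.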
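
As a minimal sketch of the temporary-table behaviour described above: the table name tmp_scores and its columns are hypothetical, and the example assumes a Hive version that supports temporary tables (0.14 or later).

```sql
-- Hypothetical table; it is visible only in the session that created it.
create temporary table tmp_scores (
  name  string,
  score int
);

-- The table shows up like any other table for the rest of this session.
show tables;

-- No explicit drop is needed: Hive removes the table and its data
-- automatically when the session ends.
```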

Table operations

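show databases;               -- list databases
show tables;                  -- list tables in the current database
use database_name;            -- switch to a database
drop database database_name;  -- drop a database
drop table table_name;        -- drop a table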

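Internal tables

An internal table is created in much the same way as a table in MySQL, with an additional row format clause that tells Hive how to parse the underlying text files.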

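create table table_name (
  field1 type,
  field2 type,
  …)
row format delimited                                   -- use a delimited row format
fields terminated by '<delimiter used in your data>'   -- how fields are split within a line
collection items terminated by ','                     -- collection (array) items are separated by ,
map keys terminated by ':'                             -- map keys and values are separated by :
lines terminated by '\n';                              -- each record ends with a newline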
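
To make the template above concrete, here is a small sketch with an array column and a map column. The table name person, its columns, and the tab delimiter are assumptions chosen for illustration, not part of the original article.

```sql
-- Hypothetical table: one record per person.
create table if not exists person (
  id      int,
  name    string,
  hobbies array<string>,
  scores  map<string,int>
)
row format delimited
fields terminated by '\t'            -- fields are separated by a tab
collection items terminated by ','   -- array items and map entries are separated by ,
map keys terminated by ':'           -- key:value inside each map entry
lines terminated by '\n';

-- A matching input line (with <TAB> standing for a tab character):
-- 1<TAB>Tom<TAB>reading,running<TAB>math:90,english:85

-- Complex columns can be read by index or by key.
select name, hobbies[0], scores['math'] from person;
```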

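Data can be added to a table in two ways: with an insert statement, or by loading a file (from the local file system or from HDFS).

insert

insert into table_name values (value, value…);

Loading from the local file system:

load data local inpath '<local data file path>' into table table_name;

Loading from HDFS:

load data inpath 'hdfs://<hostname>:9000/<file path>' into table table_name;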
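
The load variants above could be used roughly as follows. This is only a sketch: the table person and the paths /tmp/person.txt and /data/person.txt are placeholders, not values from the original article.

```sql
-- Hypothetical target table for the examples below.
create table if not exists person (
  id   int,
  name string
)
row format delimited
fields terminated by '\t';

-- Local load: the file is copied from the local file system into the table directory.
load data local inpath '/tmp/person.txt' into table person;

-- HDFS load: the source file is moved (not copied) into the table directory.
load data inpath '/data/person.txt' into table person;

-- Adding "overwrite" replaces the existing table contents instead of appending.
load data local inpath '/tmp/person.txt' overwrite into table person;

-- Quick check of what was loaded.
select * from person limit 10;
```

Note that load does not validate the file against the row format; fields that cannot be parsed simply come back as NULL when queried.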

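External tables

External tables are marked with the keyword external. The only difference from the statement that creates an internal table is whether external appears before table.

Creating an external table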

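create external table table_name (
  field1 type,
  field2 type,
  …)
row format delimited                                   -- use a delimited row format
fields terminated by '<delimiter used in your data>'   -- how fields are split within a line
collection items terminated by ','                     -- collection (array) items are separated by ,
map keys terminated by ':'                             -- map keys and values are separated by :
lines terminated by '\n';                              -- each record ends with a newline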
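
In practice an external table is usually pointed at an existing directory with a location clause. The sketch below assumes a hypothetical table access_log and a hypothetical HDFS directory /data/access_log; neither comes from the original article.

```sql
-- Hypothetical external table over files that already exist in HDFS.
create external table if not exists access_log (
  ip  string,
  url string
)
row format delimited
fields terminated by '\t'
location '/data/access_log';   -- hypothetical HDFS directory holding the data files

-- Dropping an external table removes only the metadata;
-- the files under /data/access_log stay in place.
-- (Dropping an internal table would delete its data as well.)
drop table access_log;
```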

Importing data from another table

insert into table new_table
select <columns matching the fields of new_table> from source_table;
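
A small sketch of this pattern, using hypothetical tables person and person_names. The selected columns are matched to the target table by position, so their order and types need to line up.

```sql
-- Hypothetical source and target tables.
create table if not exists person (id int, name string, age int);
create table if not exists person_names (id int, name string);

-- Append the selected rows to the target table.
insert into table person_names
select id, name from person;

-- "insert overwrite" replaces the target's contents instead of appending.
insert overwrite table person_names
select id, name from person where age >= 18;
```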

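Partitioned tables

Advantage of partitioned tables: a query can read only the partitions it needs, which improves processing efficiency.
In Hive, each partition of a partitioned table is stored in its own directory.
Partitions come in two kinds: static partitions and dynamic partitions.
A partitioned table can be either an internal table or an external table.
Partitions are declared when the table is created. Note that a partition column must not have the same name as a regular column of the table, otherwise the table would contain two identical columns.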

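create [external] table table_name (
  field1 type,
  field2 type,
  …)
partitioned by (partition_field type)
row format delimited                                   -- use a delimited row format
fields terminated by '<delimiter used in your data>'   -- how fields are split within a line
collection items terminated by ','                     -- collection (array) items are separated by ,
map keys terminated by ':'                             -- map keys and values are separated by :
lines terminated by '\n';                              -- each record ends with a newline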

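With a single partition column, each partition corresponds to one directory level in HDFS: /partition
With two partition columns: /partition/partition2
With more partition columns: /partition/partition2/…/…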
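
The difference between static and dynamic partitions, and the resulting directory nesting, can be sketched as follows. The tables orders and orders_staging, their columns, and the file path are all hypothetical.

```sql
-- Hypothetical partitioned table with two partition columns.
create table if not exists orders (
  order_id int,
  amount   double
)
partitioned by (dt string, city string)
row format delimited
fields terminated by ',';

-- Hypothetical staging table that carries the partition values as ordinary columns.
create table if not exists orders_staging (
  order_id int,
  amount   double,
  dt       string,
  city     string
);

-- Static partition: the partition values are written out explicitly.
load data local inpath '/tmp/orders_bj.csv'
into table orders partition (dt = '2019-06-20', city = 'beijing');

-- Dynamic partition: the values are taken from the trailing columns of the select.
set hive.exec.dynamic.partition = true;
set hive.exec.dynamic.partition.mode = nonstrict;

insert into table orders partition (dt, city)
select order_id, amount, dt, city from orders_staging;

-- List the partitions that now exist.
show partitions orders;
-- Each partition is a nested directory under the table directory,
-- named column=value, e.g. dt=2019-06-20/city=beijing/
```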

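Choose the partition granularity (year, month, day, hour, minute, second) in advance according to business needs, so that a query only has to read the contents of the relevant partition, which improves computation efficiency.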
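
As an illustration of choosing a time granularity, the sketch below partitions a hypothetical events table by day and hour. The table name, column names, and partition values are assumptions for the example only.

```sql
-- Hypothetical table partitioned by day (dt) and hour (hr).
create table if not exists events (
  user_id int,
  action  string
)
partitioned by (dt string, hr string);

-- Because the filter is on the partition columns, Hive only needs to read
-- the directory for that day and hour rather than the whole table.
select action, count(*) as cnt
from events
where dt = '2019-06-20' and hr = '08'
group by action;
```

A finer granularity means less data per query but more directories and small files, so the choice depends on how the data is typically queried.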