hive01

Latest recommended article published on 2023-03-19 13:47:42

PythonCN1

Article link: https://blog.csdn.net/PythonCN1/article/details/101307957

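Show a table's structure: hive> desc table_name;
Create a data file: vi table_name.txt
Upload the file to Hadoop (HDFS): hadoop fs -put table_name.txt target_path
Show a table's contents: hive> select * from table_name;
Create a table: hive> create table table_name (column_name data_type, column_name data_type)
row format delimited
fields terminated by ',';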
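As a minimal sketch of the steps above, the following HiveQL creates a small comma-delimited table and queries it. The table name t_user, its columns, the sample row, and the warehouse path /user/hive/warehouse are assumptions for illustration; the warehouse path is the usual default but may differ on your cluster.

```sql
-- Hypothetical table with two comma-separated fields.
create table if not exists t_user (
  id   int,
  name string
)
row format delimited
fields terminated by ',';

-- Check the schema that was just created.
desc t_user;

-- Shell steps, run outside the hive prompt. The target path assumes the
-- default warehouse directory and the default database:
--   vi t_user.txt          (one row per line, e.g. 1,lis)
--   hadoop fs -put t_user.txt /user/hive/warehouse/t_user/

-- Once the file sits in the table's directory, its rows can be queried.
select * from t_user;
```

Because the table is declared with fields terminated by ',', each line of the uploaded file is split on commas and mapped to the columns in order.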
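Drop a table: hive> drop table table_name;
Dropping an internal (managed) table deletes both the metadata and the table's data.
Dropping an external table deletes only the metadata; the table's data is left in place.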
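The difference between the two table types can be sketched as follows. The table names t_managed and t_ext and the HDFS directory /data/t_ext are hypothetical, chosen only for illustration.

```sql
-- Managed (internal) table: Hive owns the data under its warehouse directory.
create table t_managed (id int, name string)
row format delimited
fields terminated by ',';

-- External table: Hive records only the metadata; the files stay at the
-- given location. '/data/t_ext' is an assumed HDFS directory.
create external table t_ext (id int, name string)
row format delimited
fields terminated by ','
location '/data/t_ext';

-- Removes the metadata and the table's data files.
drop table t_managed;

-- Removes the metadata only; the files under /data/t_ext remain on HDFS.
drop table t_ext;
```

An external table is usually the safer choice when the same files are also read by other tools, since dropping the table cannot delete them.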

Partitioning keyword: partitioned by

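Create a partitioned table
hive> create table table_name (table_schema)
partitioned by (day string)
row format delimited
fields terminated by ',';
Create a data file: vi pv.data.date
Load the data into separate directories, one per partition:
hive> load data local inpath 'file_path/file_name' into table table_name partition(day='date');
List the distinct visitor IPs for a given day (use count(distinct ip) to get the number of visitors):
hive> select distinct ip from table_name where day='date';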
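A concrete version of the partitioned-table workflow might look like the sketch below. The table name t_pv, the columns ip, url and staylong, and the two local file paths are assumptions for illustration.

```sql
-- Hypothetical page-view table partitioned by day.
create table t_pv (ip string, url string, staylong int)
partitioned by (day string)
row format delimited
fields terminated by ',';

-- Each load targets one partition; the file paths are assumed to exist locally.
load data local inpath '/root/pv.data.2019-05-10' into table t_pv partition(day='2019-05-10');
load data local inpath '/root/pv.data.2019-05-11' into table t_pv partition(day='2019-05-11');

-- Each partition is stored as its own subdirectory, e.g. .../t_pv/day=2019-05-10
show partitions t_pv;

-- Number of distinct visitors on one day. Filtering on the partition column
-- lets Hive read only that partition's directory.
select count(distinct ip) as visitors from t_pv where day='2019-05-10';
```

Note that day is not a field in the data files; its value comes from the partition(...) clause given at load time.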
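Overwrite a table's existing data:
hive> load data local inpath 'file_path/file_name' overwrite into table table_to_overwrite;
Create a new table from the result of a query on another table:
hive> create table table_name
as
select columns_to_select from source_table;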
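For instance, create-table-as-select could be used like this. The names t_pv_long and t_pv and the columns are hypothetical.

```sql
-- The new table takes its column names and types from the select list.
create table t_pv_long
as
select ip, url, staylong
from t_pv
where staylong > 20;

-- Inspect the schema that was derived from the query.
desc t_pv_long;
```

This creates the table and fills it in a single statement, so no separate create table or load step is needed.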
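Insert data from another table into an existing table
Create the target table with the same schema as the source: hive> create table table_name like source_table;
hive> insert into table table_name
select
columns_to_insert
from source_table
where name = 'lis';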
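A short sketch of this pattern, with the hypothetical tables t_user (source) and t_user_lis (target):

```sql
-- 'like' copies only the schema of t_user, not its rows.
create table t_user_lis like t_user;

-- 'insert into' appends the selected rows to whatever the table already holds.
insert into table t_user_lis
select id, name from t_user where name = 'lis';

-- 'insert overwrite' replaces the table's current contents instead of appending.
insert overwrite table t_user_lis
select id, name from t_user where name = 'lis';
```

The select list has to match the target table's columns in number and order, which is why creating the target with like and selecting the same columns is a convenient combination.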
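Insert one partition's data into a partition of another table (the select list must not include the partition column day, because its value comes from the partition clause):
hive> insert into table table_name partition(day='date') select columns_to_select from source_table where day='date';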
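Export data from Hive to a directory on HDFS:
hive> insert overwrite directory 'target_directory'
select * from table_name where name = 'lis';
Export data from Hive to the local disk:
hive> insert overwrite local directory 'local_directory'
select * from table_name;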
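By default the exported files use Hive's control-character field separator (^A), which is awkward to read. As a sketch, assuming a reasonably recent Hive version that accepts a row format clause on local exports, and using the hypothetical table t_user and local directory /root/export/t_user:

```sql
-- Export matching rows to the local disk as comma-separated text.
-- Caution: 'overwrite' replaces whatever the target directory already contains.
insert overwrite local directory '/root/export/t_user'
row format delimited
fields terminated by ','
select * from t_user where name = 'lis';
```

The result is one or more plain files inside the directory rather than a single named file, so point the export at a dedicated, otherwise empty directory.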
First create a table t_seq whose file format is sequencefile:
hive> create table t_seq(id int,name string)
stored as sequencefile;
Then insert data into t_seq; Hive generates sequence files and places them in the table's directory:
hive> insert into table t_seq
select * from test_1;

Modifying a table's partitions

Show a table's partitions: hive> show partitions table_name;
Load data into a newly added partition:
hive> load data local inpath '/root/pv.data.2019-05-12' into table test_4 partition(day='2019-05-12');
hive> select * from test_4 where day='2019-05-12';
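Populate a partition from a query (here, rows from day 2019-05-11 with staylong > 20 are written into partition 2019-05-13):
hive> insert into table test_4 partition(day='2019-05-13')
select ip,url,staylong from test_4 where day='2019-05-11' and staylong>20;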
Drop a partition:
hive> alter table table_name drop partition (day='date');
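Partitions can also be added explicitly before any data is loaded. A minimal sketch on the test_4 table used above, with the partition value 2019-05-14 chosen only for illustration:

```sql
-- Add an empty partition; 'if not exists' avoids an error if it is already there.
alter table test_4 add if not exists partition (day='2019-05-14');

-- Confirm that the new partition is registered.
show partitions test_4;

-- Drop it again; 'if exists' avoids an error if it is missing.
-- On a managed table this also deletes the partition's data files.
alter table test_4 drop if exists partition (day='2019-05-14');
```

The if not exists and if exists guards make the statements safe to rerun, which helps when they are part of a scheduled script.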