0923 hive2

ruanmianmian1

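Copyright notice: this is an original article by the blog author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/ruanmianmian1/article/details/101223536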


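Hive functions: split(column, ' ') splits a string into an array, and explode(array) expands the elements of an array into one row each.

Hive version of word count (wc):

from (select explode(split(line, ' ')) as word from docs) w
insert into table wc
select word, count(1) as totalword
group by word
order by word;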
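The word count above assumes that `docs` and `wc` already exist. A minimal sketch of that setup follows; the column types and the sample rows are assumptions, since the original only gives the table and column names.

```sql
-- Assumed schema: one line of text per row.
create table if not exists docs (line string);

-- Assumed schema for the result table (count(1) returns a bigint).
create table if not exists wc (word string, totalword bigint);

-- Hypothetical sample data (insert ... values needs Hive 0.14 or later).
insert into table docs values ('hello hive'), ('hello hadoop');

-- split turns each line into an array<string>;
-- explode then emits one row per array element.
select explode(split(line, ' ')) as word from docs;
```

Running the inner select on its own like this is a convenient way to check the split before wiring it into the full insert.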

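Hive dynamic partitioning: when loading data that has no partitions, Hive assigns each row to a partition dynamically, based on the values of its partition columns.

set hive.exec.dynamic.partition.mode=nonstrict; (with this set, no static partition column is required)

from tb_user2
insert overwrite table tb_user1 partition (age, sex)
select id, name, likes, addrs, age, sex distribute by age, sex;

If the parameter above is not set (strict mode), at least one partition column must be given a static value:

from tb_user1
insert overwrite table tb_user partition(age=20,sex)
select id, name, likes, addrs, sex distribute by sex;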
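The dynamic-partition inserts above need a target table that is partitioned by age and sex. The sketch below shows one possible definition; the column types (in particular treating likes and addrs as collection types) are assumptions, not taken from the original.

```sql
-- Assumed column types; partition columns are declared separately.
create table if not exists tb_user1 (
  id int,
  name string,
  likes array<string>,
  addrs map<string, string>
)
partitioned by (age int, sex string);

-- Dynamic partitioning itself must be enabled
-- (it is on by default in recent Hive versions).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition columns come last in the select list,
-- in the same order as in the partition clause.
from tb_user2
insert overwrite table tb_user1 partition (age, sex)
select id, name, likes, addrs, age, sex
distribute by age, sex;

-- List the partitions the insert created.
show partitions tb_user1;
```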

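Bucketing (useful for data sampling)

Add clustered by (id) into 4 buckets when creating the table. To sample at query time:

select * from tb_user2 tablesample(bucket 3 out of 4 on id);

If the table was created with 32 buckets, bucket 1 out of 16 splits the 32 buckets into two groups of 16, and the 1 selects the first bucket of each group (buckets 1 and 17).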
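To make the bucketing steps concrete, here is a minimal sketch of creating, loading and sampling a bucketed table. The table name `tb_user2_bucketed` and the column types are hypothetical, chosen only for illustration.

```sql
-- Hypothetical bucketed copy of tb_user2; column types are assumed.
create table if not exists tb_user2_bucketed (
  id int,
  name string
)
clustered by (id) into 4 buckets;

-- Assumption: Hive 1.x. From Hive 2.0 on, bucketing is always
-- enforced and this line can be dropped.
set hive.enforce.bucketing=true;

-- Load through insert ... select so rows are hashed into buckets by id.
insert overwrite table tb_user2_bucketed
select id, name from tb_user2;

-- Read only the 3rd of the 4 buckets.
select * from tb_user2_bucketed tablesample(bucket 3 out of 4 on id);
```

Loading with insert ... select matters here: files copied straight into the table directory are not redistributed into buckets.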

lateral view lets a single select use multiple UDTFs (such as explode).

select count(distinct(myCol1)), count(distinct(myCol2)) from psn2

LATERAL VIEW explode(likes) myTable1 AS myCol1

LATERAL VIEW explode(address) myTable2 AS myCol2, myCol3;
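The query above implies that likes is an array and address is a map, since the second explode produces two columns. A sketch under that assumption, with the remaining column types also assumed:

```sql
-- Assumed schema: likes is an array, address is a map.
create table if not exists psn2 (
  id int,
  name string,
  likes array<string>,
  address map<string, string>
);

-- explode on an array yields one column per row;
-- explode on a map yields two columns (key and value).
select id, myCol1, myCol2, myCol3
from psn2
lateral view explode(likes) myTable1 as myCol1
lateral view explode(address) myTable2 as myCol2, myCol3;
```

Each source row is repeated once for every combination of a likes element and an address entry, which is why the original query counts distinct values rather than rows.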

Views: create view
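The original gives only the keyword, so the following is a minimal sketch of the view lifecycle; the view name `v_user_names` and the selected columns are hypothetical.

```sql
-- v_user_names is a hypothetical view over tb_user2.
create view if not exists v_user_names as
select id, name from tb_user2;

-- A view is queried like a table; its query runs at read time.
select * from v_user_names;

drop view if exists v_user_names;
```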

Indexes

create index t1_index on table tb_user2(name)

as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' with deferred rebuild

in table t1_index_table; -- the name after "in table" is the name of the index table

show index on tb_user2;

drop index t1_index on tb_user2;
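Because the index is created with deferred rebuild, it holds no data until it is rebuilt explicitly. The statement below would be run after create index and before drop index. Note that this applies to Hive versions before 3.0, where index support was removed.

```sql
-- Populate (or refresh) the index table for t1_index.
-- Applies to Hive versions before 3.0 only.
alter index t1_index on tb_user2 rebuild;
```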

Hive optimization (to be continued)
