Hadoop: A Summary of Commonly Used SQL Functions in Hive

Author: 随风奔跑之水

Categories: Data Warehouse (Hive), Hadoop

Article link: https://blog.csdn.net/weixin_40873462/article/details/90176217

1. Hive execution engine: mr / tez / spark

set hive.execution.engine = mr;
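As a small sketch of how this setting is typically used in a session: running `set` with only the key prints the current value, and assigning a value changes it for the current session only. The switch to `tez` below assumes Tez is actually installed on the cluster.

```sql
-- Print the engine currently in effect for this session
set hive.execution.engine;

-- Switch engines for this session only (assumes Tez is available on the cluster)
set hive.execution.engine = tez;
```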

2. Enabling dynamic partitioning

set hive.exec.dynamic.partition = true;
set hive.exec.dynamic.partition.mode = nonstrict;
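With these two settings in place, a dynamic-partition insert might look like the sketch below. The table names `dm.demo_user_action` and `dm.demo_user_action_src` and their columns are hypothetical, chosen only for illustration. In `nonstrict` mode all partition columns may be dynamic, whereas `strict` mode requires at least one static partition value.

```sql
set hive.exec.dynamic.partition = true;
set hive.exec.dynamic.partition.mode = nonstrict;

-- Hypothetical target table, partitioned by dt and pd
create table if not exists dm.demo_user_action (
  user_id string,
  action  string
)
partitioned by (dt string, pd int)
stored as parquet;

-- The partition columns come last in the select list,
-- in the same order as in the partition clause.
insert overwrite table dm.demo_user_action partition (dt, pd)
select
  user_id,
  action,
  dt,
  pd
from dm.demo_user_action_src;  -- hypothetical source table
```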

-- Drop a partition:
ALTER TABLE dm.user_action_self_help_w_wi DROP IF EXISTS PARTITION (dt='2019-08-15',pd=2);
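A hedged sketch of how the drop might be checked and extended: `show partitions` lists what exists before and after, and a single `ALTER TABLE ... DROP` can name several partitions. The second partition value (`dt='2019-08-16'`) is an assumption made up for the example.

```sql
-- List the partitions for one day before dropping anything
show partitions dm.user_action_self_help_w_wi partition (dt='2019-08-15');

-- Drop several partitions in one statement (the second spec is hypothetical)
alter table dm.user_action_self_help_w_wi drop if exists
  partition (dt='2019-08-15', pd=2),
  partition (dt='2019-08-16', pd=2);

-- Confirm the result
show partitions dm.user_action_self_help_w_wi;
```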

3. The WITH clause (common table expressions)

with TABLE_NAME AS (
    SELECT ... FROM ... WHERE ...
)

-- Only the first expression takes the WITH keyword; later ones follow after a comma and omit it:

TABLE_NAME AS (
    SELECT ... FROM ... WHERE ...
)
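Put together, a complete statement with two chained expressions could look like the following sketch. It reuses the table `dm.table_info_01` and the columns `id` and `time` that appear later in this article. The CTE names `valid_rows` and `latest_per_id` are made up for illustration.

```sql
with valid_rows as (
    -- the first CTE is introduced by WITH
    select id, time
    from dm.table_info_01
    where id is not null
),
latest_per_id as (
    -- later CTEs follow a comma, without WITH, and may read from earlier ones
    select id, max(time) as last_time
    from valid_rows
    group by id
)
select *
from latest_per_id;
```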

4. Renaming a column (aliases)

old_name as new_name

-- or (without AS):

old_name new_name
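As a brief illustration, both alias forms can be mixed in one query. The sketch below uses the table and columns from the example in section 5 of this article. The new names `fun_counts` and `city_name` are arbitrary choices.

```sql
select
  t.funCounts as fun_counts,  -- alias with AS
  t.cityName  city_name       -- alias without AS
from dm.official_accounts_funscount_w t;  -- "t" is a table alias, also written without AS
```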

5. row_number() over(partition by A order by B asc/desc)

row_number() over(partition by A,B,C order by D asc/desc)

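-- Groups (partitions) the query result by columns A, B and C,
-- then sorts the rows within each group by column D, ascending or descending as you choose,
-- and returns a row_number for each row that marks its position (sequence number) within the group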
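To make the numbering concrete, here is a minimal sketch on three inline rows. The column names `grp` and `val` and the values are invented for the example, and it assumes a Hive version that allows `select` without a `from` clause (0.13 or later).

```sql
select
  t.grp,
  t.val,
  row_number() over(partition by t.grp order by t.val desc) as rn
from (
  select 'a' as grp, 10 as val
  union all
  select 'a' as grp, 30 as val
  union all
  select 'b' as grp, 20 as val
) t;

-- Resulting rows (the order of the output rows themselves is not guaranteed):
--   a  30  1
--   a  10  2
--   b  20  1
```

The numbering restarts at 1 in every partition, which is what makes the "keep only rn = 1" pattern work for picking one row per group.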

Use case 1: add a sequence-number column (sample_key) to an existing Hive table (dm.official_accounts_funscount_w), e.g.:
select 
  row_number() over(
    partition by case when t.source is not null then 1 end
    order by t.source asc,t.funCounts desc
    ) as sample_key,
  t.source,
  t.cityName,
  t.weight,
  t.strArea,
  t.end_date,
  t.funCounts
from dm.official_accounts_funscount_w t;
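Note that the `case when` expression in the query above places rows whose `source` is NULL in a separate partition, so they get their own numbering starting at 1. If one continuous sequence over all rows is wanted instead, a possible variant (a sketch, not necessarily what the original author intended) is to drop the `partition by` entirely:

```sql
select
  -- no partition by: the whole result set is one partition,
  -- so NULL-source rows are numbered in the same sequence
  row_number() over(order by t.source asc, t.funCounts desc) as sample_key,
  t.source,
  t.cityName,
  t.weight,
  t.strArea,
  t.end_date,
  t.funCounts
from dm.official_accounts_funscount_w t;
```

With a single partition the whole ordering is done in one place, so this may be slow on large tables.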

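Use case 2: deduplicate a multi-column table on a single column, e.g.:
-- Temporary table 2: deduplicated data (keeps the latest row per id; the helper column rn stays in the output)
drop table if exists dm.table_info_02;
create table dm.table_info_02 stored as parquet as
select
    *
from
(
    select
        *,
        row_number() over(partition by id order by time desc) as rn
    from dm.table_info_01
) a
where a.rn = 1;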
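One way to sanity-check the result, sketched here as a suggestion rather than as part of the original recipe, is to compare the row count with the number of distinct ids in the deduplicated table:

```sql
select
  count(*)           as row_cnt,
  count(distinct id) as id_cnt
from dm.table_info_02;
```

The two numbers should match. If the source contains rows with a NULL `id`, they all fall into one partition and exactly one of them survives, so `row_cnt` is then `id_cnt + 1`, because `count(distinct id)` ignores NULLs.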

6. concat(a,b,c...)
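A short sketch of the behaviour, with literal values chosen only for illustration: `concat` joins its arguments in order and returns NULL if any argument is NULL. The related `concat_ws` takes a separator as its first argument.

```sql
select
  concat('2019', '-', '08', '-', '15')        as dt_str,     -- '2019-08-15'
  concat('a', cast(null as string), 'c')      as with_null,  -- NULL
  concat_ws('-', '2019', '08', '15')          as dt_ws;      -- '2019-08-15'
```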
