Big Data -- Hive 6 -- Hands-on practice: fetching data and inserting it into a table

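This article walks through loading data in Hive: namespaces, parameter settings, partition operations, aggregate functions, and using explode together with group by. It focuses on handling data skew and on using the concat_ws function.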
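Contents

1. Fetching data and inserting it into a table

2. Namespaces

3. Setting parameters with set, and a summary of load balancing

4. Partitions and using concat_ws

5. Using cast, coalesce, and case when

6. Grouping with explode() together with group by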

1. Fetching data and inserting it into a table

use namespace hehe;
-- Job name
set mapred.job.name=v_haha_heheyoutubehehe_ads_people_kehuduan_experience_data_{DATE};

-- The settings below are optional
-- Enable map-side aggregation
set hive.map.aggr=true;
-- Load-balance the group by when the data is skewed (default: false)
set hive.groupby.skewindata=true;
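
A note on what these two settings do. With hive.map.aggr=true, each mapper pre-aggregates its own rows before the shuffle, much like a combiner. With hive.groupby.skewindata=true, Hive plans the group by as two jobs: the first spreads rows randomly across reducers and computes partial aggregates, the second merges those partial results by the real group key. The hand-written query below is only a rough sketch of that two-stage idea, not what Hive generates internally. The table events, its column k, and the bucket count 10 are hypothetical and chosen for illustration.

```sql
-- Hypothetical input: events(cuid string, k string). Names are illustrative only.
-- Stage 1 (inner queries): attach a random salt 0..9 to every row and count per
--   (k, salt), so a single hot key is spread over up to 10 reducers.
-- Stage 2 (outer query): add up the partial counts to get the final count per k.
select
  k,
  sum(partial_cnt) as cnt
from (
  select
    k,
    salt,
    count(*) as partial_cnt
  from (
    select
      k,
      cast(floor(rand() * 10) as int) as salt
    from events
  ) salted
  group by k, salt
) stage1
group by k;
```

This manual form works for aggregates that can be merged from partial results, such as count and sum. It does not carry over directly to count(distinct ...). One caveat worth checking in your own environment: some Hive versions reject a query that has count(distinct ...) over several different expressions while hive.groupby.skewindata is on, and the query in this section contains several.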

insert overwrite table udw_ns.default.temp_table
partition(event_day='{DATE}',dim_type='heheyoutubehehe_ads_people_kehuduan_experience_data')

select 
concat_ws(
  '\t',
  cast('{DATE}' as string),
  cast(coalesce(os, '-') as string),
  cast(coalesce(soft_version, '-') as string),
  cast(coalesce(is_app_new, '-') as string),
  cast(coalesce(new_cnt, 0) AS string),
  cast(coalesce(is_open_cnt, 0) AS string),
  cast(coalesce(no_open_cnt, 0) AS string),
  cast(coalesce(disp_pv, 0) AS string),
  cast(coalesce(disp_uv, 0) AS string),
  cast(coalesce(agree_clk_uv, 0) AS string),
  cast(coalesce(disagree_clk_uv, 0) AS string),
  cast(coalesce(second_disp_pv, 0) AS string),
  cast(coalesce(second_disp_uv, 0) AS string),
  cast(coalesce(second_agree_clk_uv, 0) AS string),
  cast(coalesce(second_think_clk_uv, 0) AS string),
  cast(coalesce(third_disp_pv, 0) AS string),
  cast(coalesce(third_disp_uv, 0) AS string),
  cast(coalesce(third_check_agreement_clk_uv, 0) AS string),
  cast(coalesce(third_exit_clk_uv, 0) AS string)
) as dim_value
from(
  select 
  C.os as os,
  C.soft_version as soft_version,
  C.is_app_new as is_app_new,
  count(case when k = 'display' and v = 'privacy_pop' then b.cuid end) as disp_pv, -- first-guide popup impressions, PV (verification)
  count(distinct case when k = 'display' and v = 'privacy_pop' then b.cuid end) as disp_uv, -- first-guide popup impressions, UV (verification)
  count(distinct case when k = 'agree_clk' and v = 'privacy_pop' then b.cuid end) as agree_clk_uv, -- first-guide popup, "Agree and continue" click UV
  count(distinct case when k = 'disagree_clk' and v = 'privacy_pop' then b.cuid end) as disagree_clk_uv, -- first-guide popup, "Disagree" click UV
  count(case when k = 'display' and v = 'privacy_pop_again' then b.cuid end) as second_disp_pv, -- second (retention) popup impressions, PV
  count(distinct case when k = 'display' and v = 'privacy_pop_again' then b.cuid end) as second_disp_uv, -- second (retention) popup impressions, UV
  count(distinct case when k = 'agree_clk' and v = 'privacy_pop_again' then b.cuid end) a
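
The outer select in the query above packs every dimension and metric into one tab-separated string with concat_ws, wrapping each column in coalesce and cast. A small self-contained sketch of that pattern follows. It assumes a Hive version that accepts select without a from clause, and all literal values are made up for illustration.

```sql
-- coalesce supplies a default when the value is NULL.
-- cast(... as string) makes every argument a string, which concat_ws expects.
-- concat_ws skips NULL arguments, so without coalesce a missing value would
-- shift every later field one position to the left in the output row.
select
  concat_ws(
    '\t',
    cast('20240101' as string),                           -- made-up date value
    cast(coalesce(cast(null as string), '-') as string),  -- NULL dimension falls back to '-'
    cast(coalesce(cast(null as bigint), 0) as string),    -- NULL metric falls back to 0
    cast(coalesce(42, 0) as string)                       -- non-NULL value is kept
  ) as dim_value;
```

Using '-' for missing dimensions and 0 for missing metrics keeps the number of tab-separated fields constant, so whatever reads dim_value downstream can split it by position.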