Hive Optimization: Removing DISTINCT

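On large datasets, count(distinct ...) is prone to data skew. Hive distributes rows to reducers by the group by columns and sorts them by the distinct column, so every row of a large group ends up on the same reducer.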

1. A single distinct

Select device_name,count(distinct imei) from TableA group by device_name;

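Rewrite it with group by: deduplicate (device_name, imei) pairs in a subquery first, then count them per device_name: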

select device_name, count(imei)
from
(
    select
        device_name,
        imei
    from TableA
    group by device_name, imei
) a
group by device_name;
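
To check that the rewrite returns the same counts, here is a minimal sketch that runs both queries against a tiny in-memory table. It uses Python's sqlite3 as a stand-in for Hive, which is an assumption made only so the example is runnable; the sample rows are made up for illustration. It also shows that a NULL imei is ignored by both forms.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (device_name TEXT, imei TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [
        ("phone_a", "imei_1"),
        ("phone_a", "imei_1"),  # duplicate imei, must be counted once
        ("phone_a", "imei_2"),
        ("phone_a", None),      # NULL imei, not counted by either query
        ("phone_b", "imei_3"),
    ],
)

# Original form: count(distinct) per group.
distinct_sql = """
    SELECT device_name, COUNT(DISTINCT imei)
    FROM TableA
    GROUP BY device_name
    ORDER BY device_name
"""

# Rewritten form: deduplicate with GROUP BY first, then do a plain count.
group_by_sql = """
    SELECT device_name, COUNT(imei)
    FROM (
        SELECT device_name, imei
        FROM TableA
        GROUP BY device_name, imei
    ) a
    GROUP BY device_name
    ORDER BY device_name
"""

distinct_rows = conn.execute(distinct_sql).fetchall()
group_by_rows = conn.execute(group_by_sql).fetchall()
print(distinct_rows)
print(group_by_rows)
```

Both queries should print the same list of (device_name, count) rows. The sketch only checks the results; it says nothing about how Hive plans the two queries.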

2. distinct on two columns:

select
    device_name,
    count(distinct imei),
    count(distinct app_id)
from table1
group by device_name;

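Method 1: count each column in its own subquery, pad the other column with 0, combine the two with UNION ALL, and sum: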

SELECT
    device_name,
    SUM(imei_cnt)  AS    imei_cnt,
    SUM(app_id_cnt) AS  app_id_cnt
FROM
(    
    select
        device_name,
        count(imei) as imei_cnt,
        0           as app_id_cnt
    from
    (
        select
            device_name,
            imei
        from table1
        group by device_name,imei
    )a
    group by device_name
    UNION ALL
    select
        device_name,
        0             as imei_cnt,
        count(app_id) as app_id_cnt
    from
    (
        select
            device_name,
            app_id
        from table1
        group by device_name,app_id
    )a
    group by device_name
     
)T
GROUP BY device_name;
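
The UNION ALL approach can be checked the same way. This is a minimal sketch using Python's sqlite3 as a stand-in for Hive (an assumption for runnability only), with made-up sample rows. Each branch contributes one real count and a 0 for the other column, so the outer SUM recovers both counts on one row per device_name.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (device_name TEXT, imei TEXT, app_id TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("phone_a", "imei_1", "app_1"),
        ("phone_a", "imei_1", "app_2"),
        ("phone_a", "imei_2", "app_2"),
        ("phone_a", "imei_2", "app_3"),
        ("phone_b", "imei_3", "app_1"),
        ("phone_b", "imei_3", "app_1"),
    ],
)

# Original form with two count(distinct) columns.
distinct_sql = """
    SELECT device_name, COUNT(DISTINCT imei), COUNT(DISTINCT app_id)
    FROM table1
    GROUP BY device_name
    ORDER BY device_name
"""

# Method 1: one deduplicating branch per column, padded with 0, then summed.
union_sql = """
    SELECT device_name, SUM(imei_cnt), SUM(app_id_cnt)
    FROM (
        SELECT device_name, COUNT(imei) AS imei_cnt, 0 AS app_id_cnt
        FROM (
            SELECT device_name, imei FROM table1 GROUP BY device_name, imei
        ) a
        GROUP BY device_name
        UNION ALL
        SELECT device_name, 0 AS imei_cnt, COUNT(app_id) AS app_id_cnt
        FROM (
            SELECT device_name, app_id FROM table1 GROUP BY device_name, app_id
        ) b
        GROUP BY device_name
    ) t
    GROUP BY device_name
    ORDER BY device_name
"""

distinct_rows = conn.execute(distinct_sql).fetchall()
union_rows = conn.execute(union_sql).fetchall()
print(distinct_rows)
print(union_rows)
```

Both queries should print identical rows. The subquery aliases a, b, and t are arbitrary names.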

Method 2: count each column in its own subquery and JOIN the two results on device_name:

SELECT
    a.device_name,
    imei_cnt,
    app_id_cnt
FROM
(    
    select
        device_name,
        count(imei) as imei_cnt
    from
    (
        select
            device_name,
            imei
        from table1
        group by device_name,imei
    )a
    group by device_name
)a
JOIN
(
    select
        device_name,
        count(app_id) as app_id_cnt
    from
    (
        select
            device_name,
            app_id
        from table1
        group by device_name,app_id
    )a
    group by device_name
)b
ON a.device_name=b.device_name;
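
One thing worth checking with the JOIN approach is how it treats rows whose device_name is NULL: an equality join condition never matches NULL keys, so that group is dropped from the result, while count(distinct) with group by keeps it. The sketch below illustrates this with Python's sqlite3 as a stand-in for Hive (an assumption for runnability only) and made-up rows. If device_name cannot be NULL in your table, the two forms agree.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (device_name TEXT, imei TEXT, app_id TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("phone_a", "imei_1", "app_1"),
        ("phone_a", "imei_1", "app_2"),
        ("phone_a", "imei_2", "app_3"),
        ("phone_b", "imei_3", "app_1"),
        (None, "imei_9", "app_9"),  # row with a NULL device_name
    ],
)

# Original form with two count(distinct) columns.
distinct_sql = """
    SELECT device_name, COUNT(DISTINCT imei), COUNT(DISTINCT app_id)
    FROM table1
    GROUP BY device_name
"""

# Method 2: one deduplicated count per column, joined on device_name.
join_sql = """
    SELECT a.device_name, imei_cnt, app_id_cnt
    FROM (
        SELECT device_name, COUNT(imei) AS imei_cnt
        FROM (
            SELECT device_name, imei FROM table1 GROUP BY device_name, imei
        ) x
        GROUP BY device_name
    ) a
    JOIN (
        SELECT device_name, COUNT(app_id) AS app_id_cnt
        FROM (
            SELECT device_name, app_id FROM table1 GROUP BY device_name, app_id
        ) y
        GROUP BY device_name
    ) b
    ON a.device_name = b.device_name
"""

# Compare as sets so the check does not depend on row order.
distinct_rows = set(conn.execute(distinct_sql).fetchall())
join_rows = set(conn.execute(join_sql).fetchall())
missing_rows = distinct_rows - join_rows
print(sorted(join_rows))
print(missing_rows)  # only the NULL device_name group is missing
```

For non-NULL device_name values the two result sets match; the only difference is the NULL group.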

Tested and verified to produce correct results.
