useful SQL in Presto

expand one, two or multiple arrays into rows (multiple columns)

expand one array

Convert one array to row. E.g. Convert the following

speeds
[1,2,3,4]

to

speed
1
2
3
4

with the following SQL

WITH t AS
  (SELECT array[1,
                2,
                3,
                4] AS speeds)
SELECT speed
FROM t
CROSS JOIN unnest(speeds) AS t(speed)

expand multiple arrays

Convert arrays to rows. E.g. Convert the following

speedscosts
[1,2,3,4][5,6,7,8]

to

speedcost
15
26
37
48

with the following SQL

WITH t AS
  (SELECT array[1,
                2,
                3,
                4] AS speeds, array[5,
                                    6,
                                    7,
                                    8] AS costs)
SELECT speed,
       cost
FROM t
CROSS JOIN unnest(speeds, costs) AS t(speed, cost)

Histogram with equal width

Given a raw table

val
1
6
2
11

and then makes a histogram with width as 5. val_interval: 0 means there are 2 values (which are 1, 2) locating at the interval [0,4]. val_interval: 5, means there is 1 value (which is 6) locating at the interval [5,9].

val_intervaltotal_count
02
51
101

with the SQL

SELECT val / 5 * 5 AS val_interval,
       count(1) AS total_count
FROM t
GROUP BY val / 5 * 5
ORDER BY val / 5 * 5

Numeric Histogram in Presto

Given the input table

speeds
[1,2,3,4,5]

firstly, let’s use the SQL to convert it to table t

speed
1
2
5
8
9

with the SQL

SELECT numeric_histogram(2,speed)
   FROM t
   CROSS JOIN unnest(speeds) AS t(speed)

and then, let’s make a numeric histogram, which separates the table into 2 buckets, and we get the result

speed_intervaltotal_count
2.673.00
8.502.00

with the SQL

WITH t AS
  (SELECT array[1,
                2,
                5,
                8,
                9] AS speeds)
SELECT speed_interval,
       total_count
FROM
  (SELECT numeric_histogram(2,speed)
   FROM t
   CROSS JOIN unnest(speeds) AS t(speed)) AS x(hist)
CROSS JOIN UNNEST(hist) AS t (speed_interval, total_count)

Zip, filter in Presto

SQL

SELECT filter(cast(zip(array[1,2,3,4], array[5,6,7,8]) AS array<Row(speeds bigint, costs bigint)>), x -> x.speeds >= 3) as filtered

and we get

filtered
[[3,7],[4,8]]
  • 1
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值