lateral view 的使用
针对一个实际问题对 lateral view 的使用原理进行拆分
问题:购买每个商品品类的用户数
- 查询表格中的数据:
select * from user_goods_category limit 5;
- 处理的数据表格如下:
user_name | category_detail |
---|---|
Abby | clothes,food,electronics |
Ailsa | book,clothes,food |
Albert | clothes,electronics,computer |
Alexander | clothes,book,computer,electronics |
Alice | clothes,electronics,shoes |
从表中可以看出,category_detail
列中包含多个项目,这种表格并不符合关系型数据库的第一范式(每个属性都不可再分)。
1、首先将 category_detail 拆分开
explode(a)
- separates the elements of array a into multiple rows,
or the elements of a map into multiple rows and columns
将数组 a
的元素分隔为多个行,或将映射的元素分隔为多个行和列
select explode(split(category_detail,","))
from user_goods_category
limit 16;
结果如下:
col |
---|
clothes |
food |
electronics |
book |
clothes |
food |
clothes |
electronics |
computer |
clothes |
book |
computer |
electronics |
clothes |
electronics |
shoes |
explode 说明
explode()
拆分字段后,只含有拆分的字段,不含原表中的字段explode()
的处理对象是array
和map
对象,所以先用split()
将字符串对象 分隔为array
explode(array)
使得结果中将array
列表里的每个元素生成一行;explode(map)
使得结果中将map
里的每一对元素作为一行,key
为一列,value
为一列。
2、将拆分后的字段与原表联系起来
select user_name,b.category
from user_goods_category
lateral view explode(split(category_detail,",")) b as category
limit 16;
结果如下:
user_name | b.category |
---|---|
Abby | clothes |
Abby | food |
Abby | electronics |
Ailsa | book |
Ailsa | clothes |
Ailsa | food |
Albert | clothes |
Albert | electronics |
Albert | computer |
Alexander | clothes |
Alexander | book |
Alexander | computer |
Alexander | electronics |
Alice | clothes |
Alice | electronics |
Alice | shoes |
lateral view 说明
lateral view
为侧视图,意义是为了配合explode
来使用,把某一行数据拆分成多行数据。不加lateral view
的explode
只能提取单个字段拆分,加上lateral view
就可以将拆分的单个字段数据与原始表数据关联上。- 在使用
lateral view
的时候需要指定视图别名和生成的新列别名,此时原表不必指定别名。 lateral view explode(split(category_detail,",")) b as category
中的b
为视图名,categary
为分割后的列的别名
3、查询购买每个商品品类的用户数
基于 lateral view
的拆分和连接结果,可以很轻松的将购买每个商品品类的用户数查询出来:
select c.category,
count(distinct c.user_name) as user_num
from
(select a.user_name,
b.category
from user_goods_category a -- 原表可以不设置别名
lateral view explode(split(category_detail,",")) b as category
limit 16) as c
group by c.category;
查询结果:
c.category | user_num |
---|---|
book | 2 |
clothes | 5 |
computer | 2 |
electronics | 4 |
food | 2 |
shoes | 1 |
lateral view explode 说明
lateral view explode 相当于一个拆分字段的虚表,然后与原表进行关联。