impala: 类似hive的array_contains()函数
- 首先,需要先了解impala中的两个函数的作用,一个是
group_concat(string s [, string sep])
, 一个是find_in_set(string str,string strList)
1、group_concat(string s [, string sep])
按照指定分隔符, 将多行记录的 s 表达式结果拼接起来, 结合group by 一起使用
select id,group_concat(name,'##') from (
select 1 as id,'zhangsan' as name
union all
select 1,'zhangshu'
union all
select 2,'wangwu'
union all
select 2,'wangsan'
union all
select 2,'wangbu'
) t group by id
+-------------++-------------++-------------+
| id | group_concat(name,'##') |
+-------------++-------------++-------------+
| 1 |zhangsan##zhangshu |
+-------------++-------------++-------------+
| 2 |wangwu##wangsan##wangbu |
+-------------++-------------++-------------+
2、find_in_set(string str,string strList)
查找某个字符串在一个以逗号为分隔符的列表中第一次出现的位置(以1为起点),如果查询不到或查询字符串中出现’,’(逗号),返回则为0
--在以逗号间隔的abcdefg中字符c第一次出现的位置
[master:21000] > select find_in_set('c','a,b,c,d,e,f,g') as find_in_set;
+-------------+
| find_in_set |
+-------------+
| 3 |
+-------------+
--在查询','的位置时的返回值
[master:21000] > select find_in_set(',','a,b,c,d,e,f,g') as find_in_set;
+-------------+
| find_in_set |
+-------------+
| 0 |
+-------------+
--在查询不存在字符的位置时的返回值
[master:21000] > select find_in_set('h','a,b,c,d,e,f,g') as find_in_set;
+-------------+
| find_in_set |
+-------------+
| 0 |
+-------------+
两个函数结合实现类似array_contains() 的效果
select id,group_concat(name,','),
find_in_set("zhangshu", group_concat(name,',')) as res1,
find_in_set("wangsan",group_concat(name,',')) as res2
from (
select 1 as id,'zhangsan' as name
union all
select 1,'zhangshu'
union all
select 2,'wangwu'
union all
select 2,'wangsan'
union all
select 2,'wangbu'
) t group by id
+-------------++-------------++-------------++-------------++-------------+
| id | group_concat(name,',') | res1 | res2 |
+-------------++-------------++-------------++-------------++-------------+
| 1 |zhangsan,zhangshu | 2 | 0 |
+-------------++-------------++-------------++-------------++-------------+
| 2 |wangwu,wangsan,wangbu | 0 | 2 |
+-------------++-------------++-------------++-------------++-------------+
最后再结合if
函数,就可以实现类似的功能了