CTE
概述
公共表表达式(CTE)是从WITH子句指定的简单查询中派生的临时结果集,它紧跟在SELECT或INSERT关键字之前,CTE仅在单个语句的执行范围内定义。Hive中一个或多个CTE可与SELECT, INSERT, CREATE TABLE AS SELECT, 或 CREATE VIEW AS SELECT语句一起使用。Hive在0.13中加入了CTE,见 HIVE-1180。CTE的语法格式如下
withClause: cteClause (, cteClause)*
cteClause: cte_name AS (select statment)
说明:
- 在子查询模块中不支持WITH语句
- 在Views,CTAS和INSERT语句中可以支持CTE
- CTE不支持递归查询
示例
SELECT语句中使用CTE
with q1 as ( select key from src where key = '5')
select *
from q1;
-- from style
with q1 as (select * from src where key= '5')
from q1
select *;
-- chaining CTEs
with q1 as ( select key from q2 where key = '5'),
q2 as ( select key from src where key = '5')
select * from (select key from q1) a;
-- union example
with q1 as (select * from src where key= '5'),
q2 as (select * from src s2 where key = '4')
select * from q1 union all select * from q2;
Views,CTAS和INSERT语句中使用CTE
-- insert 示例
create table s1 like src;
with q1 as ( select key, value from src where key = '5')
from q1
insert overwrite table s1
select *;
-- ctas 示例
create table s2 as
with q1 as ( select key from src where key = '4')
select * from q1;
-- view 示例
create view v1 as
with q1 as ( select key from src where key = '5')
select * from q1;
select * from v1;
-- view 示例, name collision
create view v1 as
with q1 as ( select key from src where key = '5')
select * from q1;
with q1 as ( select key from src where key = '4')
select * from v1;
子查询
子查询又称嵌套查询,是从另一个查询的结果的投影数据的查询,通常是在FROM和WHERE子句中使用。通过使用子查询或者CTE,多个SELECT语句可以实现复杂的查询任务。使用子查询时,应为内部查询提供别名。
示例1:在FROM子句中使用子查询
SELECT
name, gender_age.gender as gender
FROM (
SELECT * FROM employee WHERE gender_age.gender = ‘Male’
) t1; – 这里的t1是必须要有的
±---------±--------+
| name | gender |
±---------±--------+
| Michael | Male |
| Will | Male |
±---------±--------+
上面的示例可以使用CTE重写,对于一些复杂的SQL查询,建议使用CTE这种方式。
WITH t1 as (
SELECT * FROM employee WHERE gender_age.gender = ‘Male’
)
SELECT name, gender_age.gender as gender
FROM t1;
±---------±--------+
| name | gender |
±---------±--------+
| Michael | Male |
| Will | Male |
±---------±--------+
示例2:在WHERE子句中使用子查询
- IN/NOT IN
> SELECT
name, gender_age.gender as gender
FROM employee
WHERE name IN
(SELECT name FROM employee WHERE gender_age.gender = 'Male');
+----------+----------+
| name | gender |
+----------+----------+
| Michael | Male |
| Will | Male |
+----------+----------+
- EXISTS/NOT EXISTS
-- 使用EXISTS/NOT EXISTS的子查询必须同时引用内部和外部表达式
> SELECT
name, gender_age.gender as gender
FROM employee a
WHERE EXISTS (
SELECT *
FROM employee b
WHERE
a.gender_age.gender = b.gender_age.gender AND b.gender_age.gender = 'Male'
); -- 这相当于使用字段gender对表a和b进行join
+----------+----------+
| name | gender |
+----------+----------+
| Michael | Male |
| Will | Male |
+----------+----------+
在WHERE子句中使用子查询有如下一些限制
- 子查询只能出现在where子句的右侧
- 不支持嵌套子查询
- 尽管使用正则表达式可以支持多个列,但是在IN/NOT IN子查询只支持使用单个列
参考
https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries