Apache Impala View、order by、group by、having子句、with子句、limit、offset、distinct

最新推荐文章于 2023-06-01 10:58:35 发布

大数据流浪法师

最新推荐文章于 2023-06-01 10:58:35 发布

阅读量1.7k

点赞数

分类专栏： # Impala 文章标签： Apache Impala View视图 order by group by having with limit

本文链接：https://blog.csdn.net/weixin_42072754/article/details/103483151

版权

Impala 专栏收录该内容

6 篇文章 0 订阅

订阅专栏

view视图

视图仅仅是存储在数据库中具有关联名称的Impala查询语言的语句。

它是以预定义的SQL查询形式的表的组合。

视图可以包含表的所有行或选定的行。

Create View IF NOT EXISTS view_name as Select statement

创建视图view、查询视图view

CREATE VIEW IF NOT EXISTS employee_view AS select name, age from employee;

[hadoop03.Hadoop.com:21000] > select * from employee;
Query: select * from employee
Query submitted at: 2019-12-10 21:14:35 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=1345b7c2c1089195:f2cb775f00000000
+----+----------+-----+-----------+--------+
| id | name     | age | address   | salary |
+----+----------+-----+-----------+--------+
| 1  | Ramesh   | 32  | Ahmedabad | 20000  |
| 5  | Hardik   | 27  | Bhopal    | 40000  |
| 6  | Komal    | 22  | MP        | 32000  |
| 3  | kaushik  | 23  | Kota      | 30000  |
| 2  | Khilan   | 25  | Delhi     | 15000  |
| 4  | Chaitali | 25  | Mumbai    | 35000  |
+----+----------+-----+-----------+--------+
Fetched 6 row(s) in 1.75s
[hadoop03.Hadoop.com:21000] > CREATE VIEW IF NOT EXISTS employee_view AS select name, age from employee;
Query: create VIEW IF NOT EXISTS employee_view AS select name, age from employee
Query submitted at: 2019-12-10 21:16:58 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=ac46f20ebe4e3e4b:4c8f1b5500000000
Fetched 0 row(s) in 1.44s
[hadoop03.Hadoop.com:21000] > select * from employee_view;
Query: select * from employee_view
Query submitted at: 2019-12-10 21:18:47 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=b04c71302c850c75:37e5068000000000
+----------+-----+
| name     | age |
+----------+-----+
| kaushik  | 23  |
| Chaitali | 25  |
| Hardik   | 27  |
| Khilan   | 25  |
| Komal    | 22  |
| Ramesh   | 32  |
+----------+-----+
Fetched 6 row(s) in 6.06s

修改视图

ALTER VIEW database_name.view_name as Select语句

[hadoop03.Hadoop.com:21000] > alter view hahaha.employee_view as select address from employee;
Query: alter view hahaha.employee_view as select address from employee
Fetched 0 row(s) in 0.10s
[hadoop03.Hadoop.com:21000] > select * from employee_view;
Query: select * from employee_view
Query submitted at: 2019-12-10 21:20:18 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=a84fcd38d30306dd:a79500b800000000
+-----------+
| address   |
+-----------+
| MP        |
| Kota      |
| Delhi     |
| Ahmedabad |
| Mumbai    |
| Bhopal    |
+-----------+
Fetched 6 row(s) in 2.75s

删除视图

DROP VIEW database_name.view_name;

[hadoop03.Hadoop.com:21000] > drop view hahaha.employee_view;
Query: drop view hahaha.employee_view
[hadoop03.Hadoop.com:21000] > select * from employee_view;   
Query: select * from employee_view
Query submitted at: 2019-12-10 21:21:02 (Coordinator: http://hadoop03:25000)
ERROR: AnalysisException: Could not resolve table reference: 'employee_view'

order by子句

Impala ORDER BY子句用于根据一个或多个列以升序或降序对数据进行排序。

默认情况下，一些数据库按升序对查询结果进行排序。

select * from table_name ORDER BY col_name
 [ASC|DESC] [NULLS FIRST|NULLS LAST]

可以使用关键字ASC或DESC分别按升序或降序排列表中的数据。

[hadoop03.Hadoop.com:21000] > select * from employee order by salary;
Query: select * from employee order by salary
Query submitted at: 2019-12-10 21:22:33 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=ed4e4cefd171cbd2:1366f4e900000000
+----+----------+-----+-----------+--------+
| id | name     | age | address   | salary |
+----+----------+-----+-----------+--------+
| 2  | Khilan   | 25  | Delhi     | 15000  |
| 1  | Ramesh   | 32  | Ahmedabad | 20000  |
| 3  | kaushik  | 23  | Kota      | 30000  |
| 6  | Komal    | 22  | MP        | 32000  |
| 4  | Chaitali | 25  | Mumbai    | 35000  |
| 5  | Hardik   | 27  | Bhopal    | 40000  |
+----+----------+-----+-----------+--------+
Fetched 6 row(s) in 1.94s
[hadoop03.Hadoop.com:21000] > select * from employee order by salary desc;
Query: select * from employee order by salary desc
Query submitted at: 2019-12-10 21:22:39 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=b549a0bcc35d56a8:8da1b60d00000000
+----+----------+-----+-----------+--------+
| id | name     | age | address   | salary |
+----+----------+-----+-----------+--------+
| 5  | Hardik   | 27  | Bhopal    | 40000  |
| 4  | Chaitali | 25  | Mumbai    | 35000  |
| 6  | Komal    | 22  | MP        | 32000  |
| 3  | kaushik  | 23  | Kota      | 30000  |
| 1  | Ramesh   | 32  | Ahmedabad | 20000  |
| 2  | Khilan   | 25  | Delhi     | 15000  |
+----+----------+-----+-----------+--------+
Fetched 6 row(s) in 0.13s

如果使用NULLS FIRST，表中的所有空值都排列在顶行;
如果使用NULLS LAST，包含空值的行将最后排列。

group by子句

Impala GROUP BY子句与SELECT语句协作使用，以将相同的数据排列到组中。
select data from table_name Group BY col_name;

having子句

Impala中的Having子句允许您指定过滤哪些组结果显示在最终结果中的条件。
一般来说，Having子句与group by子句一起使用; 它将条件放置在由GROUP BY子句创建的组上。

limit、offset

Impala中的limit子句用于将结果集的行数限制为所需的数，即查询的结果集不包含超过指定限制的记录。

一般来说，select查询的resultset中的行从0开始。使用offset子句，我们可以决定从哪里考虑输出。

[hadoop03.Hadoop.com:21000] > select * from employee limit 2 offset 1;     
Query: select * from employee limit 2 offset 1
Query submitted at: 2019-12-10 21:31:37 (Coordinator: http://hadoop03:25000)
ERROR: AnalysisException: OFFSET requires an ORDER BY clause: LIMIT 2 OFFSET 1

[hadoop03.Hadoop.com:21000] > select * from employee order by salary limit 2 offset 1;
Query: select * from employee order by salary limit 2 offset 1
Query submitted at: 2019-12-10 21:32:50 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=df4d53a3945aa47c:54af08a200000000
+----+---------+-----+-----------+--------+
| id | name    | age | address   | salary |
+----+---------+-----+-----------+--------+
| 1  | Ramesh  | 32  | Ahmedabad | 20000  |
| 3  | kaushik | 23  | Kota      | 30000  |
+----+---------+-----+-----------+--------+
Fetched 2 row(s) in 1.66s
[hadoop03.Hadoop.com:21000] > select * from employee order by id limit 2 offset 1;    
Query: select * from employee order by id limit 2 offset 1
Query submitted at: 2019-12-10 21:33:27 (Coordinator: http://hadoop03:25000)
Query progress can be monitored at: http://hadoop03:25000/query_plan?query_id=d64ef7636b6bbc3f:262e576900000000
+----+---------+-----+---------+--------+
| id | name    | age | address | salary |
+----+---------+-----+---------+--------+
| 2  | Khilan  | 25  | Delhi   | 15000  |
| 3  | kaushik | 23  | Kota    | 30000  |
+----+---------+-----+---------+--------+
Fetched 2 row(s) in 0.36s

with子句

如果查询太复杂，我们可以为复杂部分定义别名，并使用Impala的with子句将它们包含在查询中。
with x as (select 1), y as (select 2) (select * from x union y);
例如：使用with子句显示年龄大于25的员工和客户的记录。

with t1 as (select * from customers where age>25), 
   t2 as (select * from employee where age>25) 
   (select * from t1 union select * from t2);

distinct

Impala中的distinct运算符用于通过删除重复值来获取唯一值。

select distinct columns… from table_name;

大数据流浪法师

关注

0
点赞
踩
5

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录