Hive SELECT Statements

Author: 给我一台时光机

Article link: https://blog.csdn.net/whyshr/article/details/50978953

SELECT

The basic structure of a Hive-SQL query differs little from standard SQL:

select * from employee
where sex_age.sex='Male';

-- DISTINCT removes duplicate rows
select distinct name,sex_age.sex as sex from employee;
select distinct * from employee; -- supported as of Hive 1.1.0

-- HAVING is supported as of Hive 0.7.0
SELECT col1 FROM t1 GROUP BY col1 HAVING SUM(col2) > 10;

-- LIMIT caps the number of rows returned
SELECT * FROM t1 LIMIT 5; -- returns 5 arbitrary rows; no ordering is guaranteed

-- Return the top K rows
SET mapred.reduce.tasks = 1;
SELECT * FROM sales SORT BY amount DESC LIMIT 5;
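
SORT BY orders rows only within each reducer, which is why the Top K query above forces a single reducer first. As a minimal sketch of an alternative, ORDER BY produces a total order over the whole result, so no reducer setting is needed. The `sales` table and `amount` column are the same assumed names as in the example above.

```sql
-- ORDER BY guarantees a global ordering of the result,
-- so LIMIT 5 returns the five rows with the largest amount.
SELECT * FROM sales ORDER BY amount DESC LIMIT 5;
```

The trade-off is that the final ordering step of ORDER BY passes through a single reducer, so on a large table it is best paired with a LIMIT, as here.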

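Hive-SQL also supports nested queries and subqueries, but with restrictions on how deeply and how often they can be combined, particularly for subqueries in the WHERE clause. The common forms are as follows: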

-- Using a WITH clause (common table expression); the subquery must be given a name
with t1 as (
select * from employee
where sex_age.sex='Male'
)
select name,work_place from t1;

-- Subquery in the FROM clause; the subquery must be given an alias
select name,sex_age.sex from
(
select * from employee
where sex_age.sex='Male'
) t1;

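-- Subquery in the WHERE clause. WHERE-clause subqueries support IN, NOT IN,
-- EXISTS and NOT EXISTS. With IN or NOT IN, the subquery may return only one
-- column and may not reference both the inner and the outer table.
select name,sex_age.sex from
employee a -- the outer table must be given an alias, otherwise Hive reports an error
where a.name in (
select name from employee
where sex_age.sex='Male'
);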

-- ERROR: IN / NOT IN may not reference both the inner and the outer table;
-- EXISTS / NOT EXISTS has no such restriction
select name,sex_age.sex from
employee a
where a.name in (
select name from employee
where sex_age.sex!=a.sex_age.sex
);
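
For contrast with the failing IN query, here is a minimal sketch of a correlated subquery written with EXISTS, where the inner query is allowed to reference the outer alias. It assumes a Hive version that supports WHERE-clause subqueries (0.13 or later), and `employee_id` is a hypothetical second table, assumed only to have a `name` column.

```sql
-- Correlated EXISTS: the inner query references the outer alias "a".
-- employee_id is a hypothetical table, assumed to contain a name column.
SELECT a.name, a.sex_age.sex
FROM employee a
WHERE EXISTS (
  SELECT * FROM employee_id b
  WHERE a.name = b.name
);
```

Replacing EXISTS with NOT EXISTS returns the employees that have no matching row in the hypothetical `employee_id` table.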

In addition, the WHERE clause accepts any Boolean expression, including calls to user-defined functions (UDFs).
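
As an illustration, the sketch below combines comparison operators, AND/OR, and the built-in function `upper` in one WHERE clause. It assumes that the `sex_age` struct of `employee` also has an `age` field, which the examples in this article do not show.

```sql
-- Any Boolean expression is allowed in WHERE, including function calls.
-- sex_age.age is an assumed field of the sex_age struct.
SELECT name, sex_age.age
FROM employee
WHERE (sex_age.sex = 'Male' AND sex_age.age > 30)
   OR upper(name) LIKE 'M%';
```

A custom UDF registered with CREATE TEMPORARY FUNCTION can be called in the same position as `upper`.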

Partition Based Queries

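For Hive to run a query as a partition-based query, two conditions must both hold:
- the table was created with PARTITIONED BY
- the WHERE clause filters on a partition column, or the ON clause of a JOIN references a partition column
Because Hive then scans only the data in the matching partitions, partition-based queries can run considerably faster.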
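
The two conditions can be sketched as follows. The table `sales_by_day`, its columns, and the partition values are hypothetical names chosen for illustration; only the PARTITIONED BY clause and the partition-column filter matter.

```sql
-- Condition 1: the table is declared with PARTITIONED BY.
-- sales_by_day and its columns are hypothetical.
CREATE TABLE IF NOT EXISTS sales_by_day (
  item   STRING,
  amount DOUBLE
)
PARTITIONED BY (ds STRING, hr INT);

-- Condition 2: the WHERE clause filters on the partition columns,
-- so only the ds='2016-03-25', hr=12 partition needs to be read.
SELECT item, amount
FROM sales_by_day
WHERE ds = '2016-03-25' AND hr = 12;

-- No partition column in the filter: every partition is scanned.
SELECT item, amount
FROM sales_by_day
WHERE amount > 100;
```

Running SHOW PARTITIONS sales_by_day lists the partitions that exist, which helps when choosing filter values.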

REGEX Column Specification

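Hive supports selecting columns with a regular expression. The following query returns every column of sales except ds and hr:

SELECT `(ds|hr)?+.+` FROM sales;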
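
One caveat, shown here as a sketch: in Hive 0.13.0 and later, backquoted strings are treated as quoted identifiers by default, so the regex form is interpreted as a pattern only when `hive.support.quoted.identifiers` is set to `none`. The `sales` table is the same assumed table as above.

```sql
-- Hive 0.13.0+: treat backquoted strings as regex column patterns
-- instead of literal column names.
SET hive.support.quoted.identifiers=none;

-- All columns of sales except ds and hr.
SELECT `(ds|hr)?+.+` FROM sales;
```

The pattern uses Java regular expression syntax, which is what Hive applies to column names.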
