=============hql补强点!练习题===============================
01,01,80
01,02,90
01,03,99
02,01,70
02,03,80
03,03,80
04,01,50
04,02,30
create table test1(
name int,
course int,
score int
)
row format delimited
fields terminated by ',';
left join(很好的利用了反向,) --- 推荐!
select *
from test1 t
left join(
select name
from test1
where score < 80
) tmp
on t.name = tmp.name
where tmp.name is null;
----
编写sql完成如下查询,一次查询实现最好,也可以写多次查询实现:
1、查出每个学期每门课程最高分记录(包含全部5个字段)
---- 又有聚合结果,又有详细信息; 1、开窗(聚合操作) 2、自连接,双join
select id,
userid,
course,
score,
term
from (
select id,
userid,
course,
score,
term,
row_number() over (partition by term,course order by score desc) maxScore
from course_score
) tt
where tt.maxScore=1;
---
select
s.id,
s.userid,
s.course,
s.score,
s.term
from
course_score s
join
(select
course,
userid,
max(score) score
from course_score
group by course,userid) t1
on s.course=t1.course ----》 多个条件确定唯一一行值
and
s.userid=t1.userid
and
s.score=t1.score
;
2、查出单个学期中语文课在90分以上的学生的数学成绩记录(包含全部5个字段)
---- 理解有些偏差,符合条件的单个学期的学生对应的该学期的数学成绩,还是对应所有学期数学成绩(注意:的,的,理解就对了-这里偏所有)
select
s.id,
s.userid,
s.course,
s.score,
s.term
from
course_score s
join
(select
id,
userid,
course,
score,
term
from
course_score
where
score >=70
and
course="语文") t1
on s.userid=t1.userid
where s.course="数学"
and s.term = t1.term ----》 加一个判断就是,对应的该学期的数学成绩
;
解法二:select
*
from course_score t1
join course_score t2
on t1.userid = t2.userid and t1.course='语文' and t1.score>=90
and t2.id is not null and t2.course='数学'
---------------------------------
5分钟连续点击日志3次(当前点击日志时间 - 前3次点击时间 < 300s)
1,2011-12-07 13:01:03
1,2011-12-07 13:03:04
1,2011-12-07 13:02:03
1,2011-12-07 13:06:04
1,2011-12-07 13:08:09
2,2011-12-07 13:01:12
2,2011-12-07 13:01:15
2,2011-12-07 13:02:12
2,2011-12-07 13:03:12
2,2011-12-07 13:04:12
3,2011-12-07 13:04:12
3,2011-12-07 13:05:12
3,2011-12-07 13:07:12
3,2011-12-07 13:06:12
3,2011-12-07 13:10:12
create table url(
uid int,
dt timestamp
)
row format delimited
fields terminated by ',';
load data local inpath '/root/url.txt' into table url;
select distinct tmp.uid
from (

这篇博客主要展示了Hive在数据处理和分析中的应用,包括如何利用HQL进行左连接查询,找出每个学期每门课程的最高分记录,以及如何识别5分钟内连续点击日志等实际问题的解决方案。通过具体的SQL示例,深入探讨了Hive在大数据场景下的功能和技巧。
最低0.47元/天 解锁文章
795

被折叠的 条评论
为什么被折叠?



