使用Hive查询mysql经典50题
一、mysql经典50道题
学生表 Student
student_id | student_name | birth | sex |
---|---|---|---|
1 | 赵雷 | 1990-01-01 | 男 |
2 | 钱电 | 1990-12-21 | 男 |
3 | 孙风 | 1990-05-20 | 男 |
4 | 李云 | 1990-08-06 | 男 |
5 | 周梅 | 1991-12-01 | 女 |
6 | 吴兰 | 1992-03-01 | 女 |
7 | 郑竹 | 1989-07-01 | 女 |
8 | 王菊 | 1990-01-20 | 女 |
课程表 Course
course_id | course_name | teacher_id |
---|---|---|
1 | 语文 | 2 |
2 | 数学 | 1 |
3 | 英语 | 3 |
教师表 Teacher
teacher_id | teacher_name |
---|---|
1 | 张三 |
2 | 李四 |
3 | 王五 |
成绩表 Score
student_id | course_id | score |
---|---|---|
1 | 1 | 80 |
1 | 2 | 90 |
1 | 3 | 99 |
2 | 1 | 70 |
2 | 2 | 60 |
2 | 3 | 80 |
3 | 1 | 80 |
3 | 2 | 80 |
3 | 3 | 80 |
4 | 1 | 50 |
4 | 2 | 30 |
4 | 3 | 20 |
5 | 1 | 76 |
5 | 2 | 87 |
6 | 1 | 31 |
6 | 3 | 34 |
7 | 2 | 89 |
7 | 3 | 98 |
查询问题:
1、查询"01"课程比"02"课程成绩高的学生的信息及课程分数:
2、查询"01"课程比"02"课程成绩低的学生的信息及课程分数:
3、查询平均成绩大于等于60 分的同学的学生编号和学生姓名和平均成绩:
4、查询平均成绩小于60 分的同学的学生编号和学生姓名和平均成绩(包括有成绩的和无成绩的):
5、查询所有同学的学生编号、学生姓名、选课总数、所有课程的总成绩:
9、查询学过编号为"01"并且也学过编号为"02"的课程的同学的信息:
10、查询学过编号为"01"但是没有学过编号为"02"的课程的同学的信息:
12、查询至少有一门课与学号为"01"的同学所学相同的同学的信息:
13、查询和"01"号的同学学习的课程完全相同的其他同学的信息:
15、查询两门及其以上不及格课程的同学的学号,姓名及其平均成绩:
16、检索"01"课程分数小于60,按分数降序排列的学生信息:
17、按平均成绩从高到低显示所有学生的所有课程的成绩以及平均成绩:
18、查询各科成绩最高分、最低分和平均分:以如下形式显示:课程ID,课程name,最高分,最低分,平均分,及格率,中等率,优良率,优秀率:
22、查询所有课程的成绩第2 名到第3 名的学生信息及该课程成绩:
23、统计各科成绩各分数段人数:课程编号,课程名称,[100-85],[85-70],[70-60],[0-60]及所占百分比:
32、查询每门课程的平均成绩,结果按平均成绩降序排列,平均成绩相同时,按课程编号升序排列:
33、查询平均成绩大于等于85的所有学生的学号、姓名和平均成绩:
34、查询课程名称为"数学",且分数低于60的学生姓名和分数:
36、查询任何一门课程成绩在70分以上的学生姓名、课程名称和分数:
38、查询课程编号为01且课程成绩在80分以上的学生的学号和姓名:
40、查询选修"张三"老师所授课程的学生中,成绩最高的学生信息及其成绩:
41、查询不同课程成绩相同的学生的学生编号、课程编号、学生成绩:
43、统计每门课程的学生选修人数(超过5人的课程才统计)要求输出课程号和选修人数,查询结果按人数降序排列,若人数相同,按课程号升序排列
二、使用hive查询50道题
2.1 建库建表并插入数据
①建库:
create database if not exists hql50;
use hql50;
②建表:
-- 课程表:
create table if not exists course(
course_id int,
course_name string,
teacher_id int
);
-- 分数表:
create table if not exists score(
student_id int,
course_id int,
score int
);
-- 学生表:
create table if not exists student(
student_id int,
student_name string,
student_birth string,
student_sex string
);
-- 教师表:
create table if not exists teacher(
teacher_id int,
teacher_name string
);
③插入数据:
-- 课程表
insert into table course values
(01,'语文',02),
(02,'数学',01),
(03,'英语',03);
-- 分数表
insert into table score values
(01,01,80),
(01,02,90),
(01,03,99),
(02,01,70),
(02,02,60),
(02,03,80),
(03,01,80),
(03,02,80),
(03,03,80),
(04,01,50),
(04,02,30),
(04,03,20),
(05,01,76),
(05,02,87),
(06,01,31),
(06,03,34),
(07,02,89),
(07,03,98);
-- 学生表
insert into table student values
(01,'赵雷', '1990-01-01','男'),
(02,'钱电', '1990-12-21','男'),
(03,'孙风', '1990-05-20','男'),
(04,'李云', '1990-08-06','男'),
(05,'周梅', '1991-12-01','女'),
(06,'吴兰', '1992-03-01','女'),
(07,'郑竹', '1989-07-01','女'),
(08,'王菊', '1990-01-20','女');
-- 教师表
insert into table teacher values (01,'张三'),(02,'李四'),(03,'王五');
2.2 查询
注:
- 由于是针对sql语句的练习,数量量很小,此处设置hive本地模式运行,不用执行map reduce,会提高查询效率
set hive.exec.mode.local.auto=true;
set hive.exec.mode.local.auto.inputbytes.max=50000000;
set hive.exec.mode.local.auto.tasks.max=10;
- 使用 Zeppelin 插件查询,查询结果更加简洁直观
zeppelin相关配置:zeppelin环境搭建及连接hive使用
1、查询"01"课程比"02"课程成绩高的学生的信息及课程分数:
select s.*,sc.course_id,sc.score
from student s
join score sc
on s.student_id = sc.student_id
join
(
select sc1.student_id
from (select * from score where course_id = 1) sc1
join (select * from score where course_id = 2)sc2
on sc1.student_id = sc2.student_id
where sc1.score > sc2.score
) t
on s.student_id = t.student_id;
2、查询"01"课程比"02"课程成绩低的学生的信息及课程分数:
select s.*,sc.course_id,sc.score
from student s
join score sc
on s.student_id = sc.student_id
join
(
select sc1.student_id
from (select * from score where course_id = 1) sc1
join (select * from score where course_id = 2)sc2
on sc1.student_id = sc2.student_id
where sc1.score < sc2.score
) t
on s.student_id = t.student_id;
3、查询平均成绩大于等于60 分的同学的学生编号和学生姓名和平均成绩:
select sc.student_id,s.student_name,avg(score) avg_score
from student s
join score sc
on s.student_id = sc.student_id
group by sc.student_id,s.student_name
having avg_score >= 60;
4、查询平均成绩小于60 分的同学的学生编号和学生姓名和平均成绩(包括有成绩的和无成绩的):
select s.student_id,s.student_name,avg(sc.score) avg_score
from student s
left join score sc
on s.student_id = sc.student_id
group by s.student_id,s.student_name
having avg_score <60 or avg_score is null;
5、查询所有同学的学生编号、学生姓名、选课总数、所有课程的总成绩:
select s.student_id,s.student_name,count(sc.course_id) count_course,sum(score) sum_score
from student s
left join score sc
on s.student_id = sc.student_id
group by s.student_id,s.student_name;
select count(1)
from teacher
where teacher_name like'李%';
select s.*,sc.course_id,sc.score
from student s
join score sc
on s.student_id = sc.student_id
join course co
on sc.course_id = co.course_id
join teacher t
on t.teacher_id = co.teacher_id
where t.teacher_name = '张三';
思路:
- 学生表s 全部的学生id有1,2,3,4,5,6,7,8
- 学过张三老师课程的学生id有(临时表t)1,2,3,4,5,7
- s left join t ,关联条件为student_id,当右表 t.student_id为空时,即为没学过"张三"老师课程的学生
- 也可以使用 not in ,容易报错,不推荐使用
select s.*
from student s
left join (
select s1.student_id
from student s1
join score sc
on s1.student_id = sc.student_id
join course co
on sc.course_id = co.course_id
join teacher t
on t.teacher_id = co.teacher_id
where t.teacher_name = '张三'
)t on s.student_id = t.student_id
where t.student_id is null;
9、查询学过编号为"01"并且也学过编号为"02"的课程的同学的信息:
select s.*,sc1.course_id,sc1.score,sc2.course_id,sc2.score
from student s
join score sc1
on s.student_id = sc1.student_id
join score sc2
on sc1.student_id = sc2.student_id
where sc1.course_id = 1 and sc2.course_id = 2;
10、查询学过编号为"01"但是没有学过编号为"02"的课程的同学的信息:
思路:
- 查询学过"01"课的学生id, left join 学过"02"课的学生id,左表为空时即答案(右表中没有我就表示我学过1但是没学过2)
--查询学过"01"课的学生id
select student_id
from score
where course_id = 1
--查询学过"02"课的学生id
select student_id
from score
where course_id = 2
--查询学过编号为"01"但是没有学过编号为"02"的课程的同学id
select sc1.student_id
from(
select student_id
from score
where course_id = 1
) sc1
left join(
select student_id
from score
where course_id = 2
) sc2 on sc1.student_id = sc2.student_id
where sc2.student_id is null;
--再join一次student,查询该学生信息
select s.*
from student s
join
(select sc1.student_id
from(
select student_id
from score
where course_id = 1
) sc1
left join(
select student_id
from score
where course_id = 2
) sc2 on sc1.student_id = sc2.student_id
where sc2.student_id is null
) t on s.student_id = t.student_id;
思路: student s left join (学过所有课程的学生id) t on s.student_id = t.student_id where t.student_id is null
方法一:分步查询
--查询一共几门课:
select count(1) from course;
-- 3
--查询没有学全所有课程的同学的信息:
select s.*
from student s
left join(
select sc.student_id,count(course_id) count_course
from score sc
group by student_id
having count_course = 3
) t on s.student_id = t.student_id
where t.student_id is null;
方法二:一步到位
select s.*
from student s
join(select count(1) num from course) co
left join(
select student_id,count(course_id) num
from score sc
group by student_id
) sc on s.student_id = sc.student_id and co.num = sc.num
where sc.student_id is null;
<hr>
12、查询至少有一门课与学号为"01"的同学所学相同的同学的信息:
思路: 查询"01"号学生学过的课程信息join course course_id,再排除"01"号学生
select distinct s.*
from student s
join score sc
on s.student_id = sc.student_id
join course co
on sc.course_id = co.course_id
join(
select sc.course_id cour
from score sc
join course co
on sc.course_id = co.course_id
where sc.student_id = 1
) t on co.course_id = t.cour
where s.student_id != 1;
13、查询和"01"号的同学学习的课程完全相同的其他同学的信息:
思路: 在12的基础上在加上条件 count("01"同学学的课程数相同)
查询"01"学生的课程信息:
select concat_ws('|', collect_set(cast(course_id as string))) courseId
from score where student_id = 1;
注:
- cast()函数将int转换成string
- collect_set(参数为string类型)函数列转行并去重
- concat_ws()函数拼接字符串,并制定分隔符为"|"
查询和"01"号的同学学习的课程完全相同的其他同学的信息:
select s.*
from student s
join(
select student_id,concat_ws('|',collect_set(cast(course_id as string))) courseId
from score
where student_id != 1
group by student_id
) t1
on s.student_id = t1.student_id
join(
select concat_ws('|',collect_set(cast(course_id as string))) courseId
from score
where student_id = 1
) t2
on t1.courseId = t2.courseId;
思路: student s left join (学过张三老师表的学生id)t on s.student_id = t.student_id where t.student_id is null
select s.student_name
from student s
left join(
select sc.student_id stuid
from score sc
join course co
on sc.course_id = co.course_id
join teacher te
on co.teacher_id = te.teacher_id
where teacher_name = '张三'
) t on s.student_id = t.stuid
where t.stuid is null;
15、查询两门及其以上不及格课程的同学的学号,姓名及其平均成绩:
思路: 查询学号,姓名,平均成绩 ,关联条件为(两门及其以上不及格课程的同学的学号)
select s.student_id,s.student_name,t1.avg_score
from student s
left join(
select student_id,round(avg(score),2) avg_score
from score
group by student_id
) t1 on s.student_id = t1.student_id
join(
select student_id,count(1) count_failed
from score
where score < 60
group by student_id
having count_failed >= 2
) t2 on t1.student_id = t2.student_id;
16、检索"01"课程分数小于60,按分数降序排列的学生信息:
select s.*,sc.course_id,sc.score
from student s
join score sc
on s.student_id = sc.student_id
where sc.course_id = 1 and sc.score < 60
order by sc.score desc;
17、按平均成绩从高到低显示所有学生的所有课程的成绩以及平均成绩:
with t as(
select student_id,avg(score) avg_score
from score
group by student_id
)
select sc.student_id,sc.score,t.avg_score
from score sc
join t
on sc.student_id = t.student_id
order by t.avg_score desc;
18、查询各科成绩最高分、最低分和平均分:以如下形式显示:课程ID,课程name,最
高分,最低分,平均分,及格率,中等率,优良率,优秀率:
with t as
(
select
course_id,
max(score) max_score,
min(score) min_score,
avg(score) avg_score,
count(1) count_course,
sum(case when score >= 60 then 1 else 0 end) jige,
sum(case when score >60 and score < 80 then 1 else 0 end) zhong,
sum(case when score >= 80 and score < 90 then 1 else 0 end) liang,
sum(case when score >= 90 then 1 else 0 end) you
from score
group by course_id
)
select
t.course_id,co.course_name,t.max_score,t.min_score,t.avg_score,
t.jige / t.count_course `及格率`,
t.zhong / t.count_course `中等率`,
t.liang / t.count_course `优良率`,
t.you / t.count_course `优秀率`
from t
join course co
on t.course_id = co.course_id;
select sc1.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc1 where course_id = 1 order by rank
union all
select sc2.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc2 where course_id = 2 order by rank
union all
select sc3.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc3 where course_id = 3 order by rank
;
with t as
(
select s.student_id,s.student_name,sum(sc.score) sum_score
from student s
left join score sc
on s.student_id = sc.student_id
group by s.student_id,s.student_name
)
select
t.*,
dense_rank() over(order by sum_score desc) `rank` from t;
select te.teacher_name,sc.course_id,avg(sc.score) avg_score
from score sc
join course co
on sc.course_id = co.course_id
join teacher te
on co.teacher_id = te.teacher_id
group by te.teacher_name,sc.course_id
order by avg_score desc;
22、查询所有课程的成绩第2 名到第3 名的学生信息及该课程成绩:
select s.*,t.*
from student s
join
(
select *,
dense_rank() over(partition by course_id order by score desc) rn
from score
) t on s.student_id = t.student_id
where t.rn = 2 or t.rn = 3;
group by t.student_id,t.course_id;
23、统计各科成绩各分数段人数:课程编号,课程名称,[100-85],[85-70],[70-60],[0-60]及所
占百分比
with t as
(
select course_id,
count(1) count_course,
sum(case when score >= 85 then 1 else 0 end) youxiu,
sum(case when score >=70 and score <85 then 1 else 0 end) zhong,
sum(case when score >=60 and score <70 then 1 else 0 end) jige,
sum(case when score >=0 and score <60 then 1 else 0 end) bujige
from score
group by course_id
)
select co.course_id,co.course_name,
t.youxiu / t.count_course `[100-85]`,
t.zhong / t.count_course `[85-70]`,
t.jige / t.count_course `[70-60]`,
t.bujige / t.count_course `[0-60]`
from t
join course co
on t.course_id = co.course_id;
注:取别名[100-85]
with t as
(
select s.student_id,s.student_name,avg(sc.score) avg_score
from student s
left join score sc
on s.student_id = sc.student_id
group by s.student_id,s.student_name
)
select
t.*,
dense_rank() over(order by avg_score desc) `rank`
from t;
with t as(
select sc1.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc1 where course_id = 1 order by rank
union all
select sc2.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc2 where course_id = 2 order by rank
union all
select sc3.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc3 where course_id = 3 order by rank
)
select * from t where rank <= 3;
select course_id,count(1)
from score
group by course_id;
with t as
(
select s.student_id,s.student_name,count(sc.course_id) count_course
from student s
join score sc
on s.student_id = sc.student_id
group by s.student_id,s.student_name
)
select student_id,student_name from t where count_course = 2;
select student_sex,count(1) from student group by student_sex;
select * from student where student_name like '%风%';
select s1.student_name,count(1) same_name
from student s1,student s2
where s1.student_name = s2.student_name and s1.student_id != s2.student_id
group by s1.student_name;
select * from student
where year(student_birth) = 1990;
32、查询每门课程的平均成绩,结果按平均成绩降序排列,平均成绩相同时,按课程编号升序排列:
with t as
(
select course_id,avg(score) avg_score
from score
group by course_id
)
select
t.*,
dense_rank() over(order by avg_score desc,course_id asc)
from t;
33、查询平均成绩大于等于85的所有学生的学号、姓名和平均成绩:
select s.student_id,s.student_name,avg(sc.score) avg_score
from student s
join score sc
on s.student_id = sc.student_id
group by s.student_id,s.student_name
having avg_score >= 85;
34、查询课程名称为"数学",且分数低于60的学生姓名和分数:
select s.student_name,sc.score,co.course_id,course_name
from student s
left join score sc
on s.student_id = sc.student_id
left join course co
on sc.course_id = co.course_id
where course_name = '数学' and sc.score < 60 or sc.score is null;
select s.student_id,s.student_name,sc.course_id,co.course_name,sc.score
from student s
left join score sc
on s.student_id = sc.student_id
left join course co
on sc.course_id = co.course_id;
36、查询任何一门课程成绩在70分以上的学生姓名、课程名称和分数:
select s.student_name,co.course_name,sc.score
from student s
join score sc
on s.student_id = sc.student_id
join course co
on sc.course_id = co.course_id
where sc.score > 70;
select s.student_id,s.student_name,co.course_id,sc.score
from student s
left join score sc
on s.student_id = sc.student_id
left join course co
on sc.course_id = co.course_id
where sc.score < 60 or sc.score is null;
38、查询课程编号为01且课程成绩在80分以上的学生的学号和姓名:
select s.student_id,s.student_name,sc.course_id,sc.score
from student s
join score sc
on s.student_id = sc.student_id
where sc.course_id = 1 and sc.score > 80;
select co.course_id,co.course_name,count(1) count_course
from course co
left join score sc
on co.course_id = sc.course_id
group by co.course_id,co.course_name;
40、查询选修"张三"老师所授课程的学生中,成绩最高的学生信息及其成绩:
with t as
(
select s.*,sc.score,co.course_id,te.teacher_name,
dense_rank() over(partition by sc.course_id order by sc.score desc) rn
from student s
join score sc join course co join teacher te
on s.student_id = sc.student_id
and sc.course_id = co.course_id
and co.teacher_id = te.teacher_id
and te.teacher_name = "张三"
)
select t.* from t where t.rn = 1;
41、查询不同课程成绩相同的学生的学生编号、课程编号、学生成绩:
select sc1.student_id,sc1.course_id,sc1.score,sc2.student_id,sc2.course_id,sc2.score
from score sc1,score sc2
where sc1.course_id != sc2.course_id and sc1.score = sc2.score;
with t as(
select sc1.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc1 where course_id = 1 order by rank
union all
select sc2.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc2 where course_id = 2 order by rank
union all
select sc3.*,
dense_rank() over(partition by course_id order by score desc) `rank`
from score sc3 where course_id = 3 order by rank
)
select * from t where rank <= 3;
43、统计每门课程的学生选修人数(超过5人的课程才统计)要求输出课程号和选修人数,查询结果按人数降序排列,若人数相同,按课程号升序排列
select course_id,count(1) count_course
from score
group by course_id
having count_course > 5
order by count_course desc,course_id;
注: 分组有排序限制
select student_id,count(1) count_course
from score
group by student_id
having count_course >= 2;
select s.*
from student s
join
(
select sc.student_id,count(1) count_course
from score sc
group by sc.student_id
) t1
join
(
select count(1) all_course from course
) t2
on s.student_id = t1.student_id and t1.count_course = t2.all_course;
select *,
((year(current_date())-year(student_birth)) -
(case when month(current_date) < month(student_birth) then 1
when month(current_date) = month(student_birth) and day(current_date) < day(student_birth) then 1
else 0 end)
) age
from student;
select * from student where weekofyear(current_date) = weekofyear(student_birth);
select * from student where weekofyear(current_date)+1 = weekofyear(student_birth);
select * from student where month(current_date) = month(student_birth);
select * from student where month(student_birth) = 12;