– 1、查询"01"课程比"02"课程成绩高的学生的信息及课程分数:
SELECT s.*,s1.c_id,s1.s_score,s2.c_id,s2.s_score
FROM student s
join score s1 on s.s_id=s1.s_id and s1.c_id='01'
join score s2 on s.s_id=s2.s_id and s2.c_id='02'
where s1.s_score > s2.s_score
思路:
先构建表,join一个合适的大表后会方便我们进行查询操作
在这个表中可以轻松的进行01,02课程分数的比较
自己一开始写出来的sql,但是sql出现了无法执行的问题?
出现的问题:
SELECT s_name,s_birth,s_sex,c_id,s_score
from student LEFT JOIN score on student.s_id=score.s_id
where student.s_id=(
SELECT s1.s_id FROM score s1 join score s2 on s1.s_id=s2.s_id
where s1.c_id='01' and s2.c_id='02' and s1.s_score>s2.s_score
)
and c_id='01' or c_id='02'
明明两个子句都可以查询到结果,但是结合到一起就不行了?s_id有两个值,这里只能用IN
Error: Error while compiling statement: FAILED: ParseException line 4:0 cannot recognize input near 'SELECT' 's1' '.' in expression specification (state=42000,code=40000)
SELECT s1.s_id FROM score s1 join score s2 on s1.s_id=s2.s_id
where s1.c_id='01' and s2.c_id='02' and s1.s_score>s2.s_score
SELECT s_name,s_birth,s_sex,c_id,s_score
from student LEFT JOIN score on student.s_id=score.s_id
加入IN但是又出现新问题,这是为什么?
SELECT s_name,s_birth,s_sex,c_id,s_score
from student LEFT JOIN score on student.s_id=score.s_id
where student.s_id IN (
SELECT s1.s_id FROM score s1 join score s2 on s1.s_id=s2.s_id
where s1.c_id='01' and s2.c_id='02' and s1.s_score>s2.s_score
)
and c_id='01' or c_id='02'
Error: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 3:19 Unsupported SubQuery Expression 's_id': Only SubQuery expressions that are top level conjuncts are allowed (state=42000,code=10249)
不支持的子查询表达式“s_id”:只允许是顶级连接的子查询表达式
hive中子查询语句只能出现在from子句中,其他地方目前不允许
在mysql中是能够成功执行的
但是怎么全查出来了?我最后的条件and or 没加括号
SELECT s_name,s_birth,s_sex,c_id,s_score
from student LEFT JOIN score on student.s_id=score.s_id
where student.s_id IN (
SELECT s1.s_id FROM score s1 join score s2 on s1.s_id=s2.s_id
where s1.c_id='01' and s2.c_id='02' and s1.s_score>s2.s_score
)
and (c_id='01' or c_id='02')
– 2、查询"01"课程比"02"课程成绩低的学生的信息及课程分数:
– 3、查询平均成绩大于等于60分的同学的学生编号和学生姓名和平均成绩:
SELECT student.s_id,s_name,ROUND(AVG(s_score),1) avg
FROM student join score on student.s_id = score.s_id
GROUP BY student.s_id HAVING avg>=60
思路:
要查询的数据在student表和score表中,先进行多表连接,之后根据学生s_id进行分组求平均值
mysql中可以成功执行
hive中遇到的问题:
Error: Error while compiling statement: FAILED: SemanticException [Error 10025]: Line 3:29 Expression not in GROUP BY key '60' (state=42000,code=10025)
问题原因
在 Group by 子句中,Select 查询的列,要么需要是 Group by 中的列,要么得是用聚合函数(比如 sum、count 等)加工过的列。不支持直接引用非 Group by 的列。
修改:
SELECT student.s_id,s_name,ROUND(AVG(s_score),1) avg
FROM student join score on student.s_id = score.s_id
GROUP BY student.s_id HAVING ROUND(AVG(s_score),1)>=60
仍然报错
Error: Error while compiling statement: FAILED: SemanticException [Error 10025]: Line 1:20 Expression not in GROUP BY key 's_name' (state=42000,code=10025)
问题原因
在 Group by 子句中,Select 查询的列,要么需要是 Group by 中的列,要么得是用聚合函数(比如 sum、count 等)加工过的列。不支持直接引用非 Group by 的列。
s_name没有出现在group by中
解决:
SELECT student.s_id,s_name,ROUND(AVG(s_score),1) avg
FROM student join score on student.s_id = score.s_id
GROUP BY student.s_id,s_name HAVING ROUND(AVG(s_score),1)>=60
– 4、查询平均成绩小于60分的同学的学生编号和学生姓名和平均成绩:
– (包括有成绩的和无成绩的)
SELECT student.s_id,s_name,ROUND(AVG(s_score),1) avg
FROM student left join score on student.s_id = score.s_id
GROUP BY student.s_id,s_name HAVING ROUND(AVG(s_score),1)<60
但是会漏查王菊
修改:
SELECT s1.s_id,s_name,ROUND(AVG(s_score),1) avg
FROM student s1 left join score sc1 on s1.s_id = sc1.s_id
GROUP BY s1.s_id,s_name HAVING ROUND(AVG(s_score),1)<60
union all
SELECT s2.s_id,s_name,0 avg
FROM student s2
WHERE s2.s_id NOT IN (SELECT DISTINCT sc2.s_id FROM score sc2)
养成一个习惯,sql中写清楚每个字段所属的表
– 5、查询所有同学的学生编号、学生姓名、选课总数、所有课程的总成绩:
SELECT s1.s_id,s_name,COUNT(c_id) count,SUM(s_score) sum
FROM student s1 left JOIN score sc1 on s1.s_id=sc1.s_id
GROUP BY s1.s_id,s_name
后续部分参见: