1-4 HAVING子句的力量
用HAVING子句进行子查询:求众数
使用谓词:
SELECT income, COUNT(*) AS cnt
FROM Graduates
GROUP BY income
HAVING COUNT(*) >= ALL(SELECT COUNT(*)
FROM Graduates
GROUP BY income);
使用极值函数:
SELECT income, COUNT(*) AS cnt
FROM Graduates
GROUP BY income
HAVING COUNT(*) >= (SELECT MAX(cnt)
FROM (SELECT COUNT(*) AS cnt
FROM Graduates
GROUP BY income) TMP);
用HAVING子句进行自连接:求中位数
求Graduates表中income的中位数
思路:
SELECT AVG(DISTINCT income)
FROM (SELECT T1.income
FROM Graduates T1, Graduates T2
GROUP BY T1.income
--S1的条件
HAVING SUM(CASE WHEN T2.income >= T1.income
THEN 1 ELSE 0 END)
>= COUNT(*) / 2
--S2的条件
AND SUM(CASE WHEN T2.income <= T1.income
THEN 1 ELSE 0 END)
>= COUNT(*) / 2) TMP;
查询不包含NULL的集合
COUNT函数的结果根据参数的不同而不同。COUNT(*)会得到包含NULL的数据行数,而COUNT(<列名>)会得到NULL之外的数据行数。可以利用该特性查询不包含NULL的集合。
有一张存储了学生提交报告的日期的表Students,学生提交报告后,“提交日期”列会被写入日期,而提交之前是NULL。现在需要从这张表中找出哪些学院的学生全部提交了报告。
--使用COUNT函数
SELECT dpt
FROM Students
GROUP BY dpt
HAVING COUNT(*) = COUNT(sbmt_date);
--使用CASE表达式
SELECT dpt
FROM Students
GROUP BY dpt
HAVING COUNT(*) = SUM(CASE WHEN sbmt_date IS NOT NULL THEN 1 ELSE 0 END);
用关系除法运算进行购物篮分析
假设有两张表:全国连锁折扣店的商品表Items,以及各个店铺的库存管理表ShopItems。
要查询囊括了表Items中所有商品的店铺(带余除法,division with a remainder)
SELECT SI.shop
FROM ShopItems SI, Items I
WHERE SI.item = I.item
GROUP BY SI.shop
HAVING COUNT(SI.item) = (SELECT COUNT(item) FROM Items);
WHERE SI.item = I.item
过滤掉了ShopItems表中在Items表中不存在的商品数据,然后按店铺聚合函数,选出店铺中商品种数等于Items表中商品种数的店铺。
要查询囊括了表Items中所有商品的店铺且没有多余商品种类的店铺(精确关系除法,exact relational division)
SELECT SI.shop
FROM ShopItems SI LEFT OUTER JOIN Items I
ON SI.item = I.item
GROUP BY SI.shop
HAVING COUNT(SI.item) = (SELECT COUNT(item) FROM Items)
AND COUNT(I.item) = (SELECT COUNT(item) FROM Items)
表ShopItems和表Items外连接后的结果
练习题1-4-2:练习“特征函数”
像CASE表达式这样用来判断各个元素(=行)是否属于满足了某种条件的集合的函数称为特征函数。 请写出查询“全体学生都在9月份提交了报告的学院”的SQL语句。
SELECT dpt
FROM Students
GROUP BY dpt
HAVING count(*) =
SUM(CASE WHEN EXTRACT(YEAR FROM sbmt_date) = 2005
AND EXTRACT(MONTH FROM sbmt_date) = 09 THEN 1 ELSE 0 END);
1-5 外连接的用法
用外连接进行行列转换(1)(行→列):制作交叉表
利用上表生成下面的交叉表
使用外连接:
SELECT C0.name,
CASE WHEN C1.name IS NOT NULL
THEN 'o' ELSE NULL END AS "SQL入门",
CASE WHEN C2.name IS NOT NULL
THEN 'o' ELSE NULL END AS "UNIX基础",
CASE WHEN C3.name IS NOT NULL
THEN 'o' ELSE NULL END AS "Java中级",
FROM (SELECT DISTINCT name FROM Courses) C0
LEFT OUTER JOIN
(SELECT name FROM Courses WHERE course='SQL入门') C1
ON C0.name = C1.name
LEFT OUTER JOIN
(SELECT name FROM Courses WHERE course='UNIX基础') C2
ON C0.name = C2.name
LEFT OUTER JOIN
(SELECT name FROM Courses WHERE course='Java中级') C3
ON C0.name = C3.name;
使用标量子查询:
SELECT C0.name,
(SELECT 'o'
FROM Courses C1
WHERE course = 'SQL入门'
AND C1.name = C0.name) AS “SQL入门“,
(SELECT 'o'
FROM Courses C2
WHERE course = 'UNIX基础'
AND C2.name = C0.name) AS “UNIX基础“,
(SELECT 'o'
FROM Courses C3
WHERE course = 'Java中级'
AND C3.name = C0.name) AS “Java中级“
FROM (SELECT DISTINCT name FROM Courses) C0;
嵌套使用CASE表达式:
SELECT name,
CASE WHEN SUM(CASE WHEN course = 'SQL入门'
THEN 1 ELSE NULL END) = 1
THEN 'o' ELSE NULL END AS "SQL入门",
CASE WHEN SUM(CASE WHEN course = 'UNXI基础'
THEN 1 ELSE NULL END) = 1
THEN 'o' ELSE NULL END AS "UNXI基础",
CASE WHEN SUM(CASE WHEN course = 'Java中级'
THEN 1 ELSE NULL END) = 1
THEN 'o' ELSE NULL END AS "Java中级"
FROM Courses
GROUP BY name;
用外连接进行行列转换(2)(列→行):汇总重复项于一列
使用UNION ALL将列数据转换为行数据:
SELECT employee, child_1 AS child FROM Personnel
UNION ALL
SELECT employee, child_2 AS child FROM Personnel
UNION ALL
SELECT employee, child_3 AS child FROM Personnel;
执行结果如下:
employee child
赤井 一郎
赤井 二郎
赤井 三郎
工藤 春子
工藤 夏子
工藤
铃木 夏子
铃木
铃木
吉田
吉田
吉田
如果要排除掉“child”列为NULL的行,但是保留没有孩子的吉田,像下面的结果这样:
employee child
赤井 一郎
赤井 二郎
赤井 三郎
工藤 春子
工藤 夏子
铃木 夏子
吉田
先生成一个存储子女列表的视图:
CREATE VIEW Children(child)
AS SELECT child_1 FROM Personnel
UNION
SELECT child_2 FROM Personnel
UNION
SELECT child_3 FROM Personnel;
接下来,以员工列表为主表进行外连接操作:
SELECT EMP.employee, CHILDREN.child
FROM Personnel EMP
LEFT OUTER JOIN Children
ON CHILDREN.child IN (EMP.child_1, EMP.child_2, EMP.child_3);
在交叉表里制作嵌套式表侧栏
根据上述三张表生成包含嵌套式表侧栏的统计表:
SELECT MASTER.age_class AS age_class,
MASTER.sex_cd AS sex_cd,
DATA.pop_tohoku AS pop_tohoku,
DATA.pop_kanto AS pop_kanto
FROM (SELECT age_class, sex_cd
FROM TblAge CROSS JOIN TblSex) MASTER
LEFT OUTER JOIN
(SELECT age_class, sex_cd,
SUM(CASE WHEN pref_name IN ('青森','秋田')
THEN population ELSE NULL END) AS pop_tohoku,
SUM(CASE WHEN pref_name IN ('东京','千叶')
THEN population ELSE NULL END) AS pop_kanto
FROM TblPop
GROUP BY age_class, sex_cd) DATA
ON MASTER.age_class = DATA.age_class
AND MASTER.sex_cd = DATA.sex_cd;
需要生成嵌套式表侧栏时,事先按照需要的格式准备好主表即可。
- 对于不支持
CROSS JOIN
语句的数据库,可以像FROM TblAge, TblSex
这样不指定连接条件,把需要连接的表写在一起,其效果与交叉连接一样。
作为乘法运算的连接
使用以上两张表生成下面的结果:
item_no total_qty
10 36
20 32
30 22
40
方法1:通过在连接前聚合来创建一对一的关系
SELECT I.item_no, SH.total_qty
FROM Items I LEFT OUTER JOIN
(SELECT item_no, SUM(quantity) AS total_qty
FROM SalesHistory
GROUP BY item_no) SH
ON I.item_no = SH.item_no;
方法2:先进行一对多的连接再聚合
SELECT I.item_no, SUM(SH.quantity) AS total_qty
FROM Items I LEFT OUTER JOIN SalesHistory SH
ON I.item_no = SH.item_no
GROUP BY I.item_no;
1-6 用关联子查询比较行与行
增长、减少、维持现状
根据上表的数据,使用SQL输出与上一年相比营业额是增加、减少抑或是不变。
--使用关联子查询
SELECT S1.year, S1.sale,
CASE WHEN sale =
(SELECT sale
FROM Sales S2
WHERE S2.year = S1.year - 1) THEN '→'
WHEN sale >
(SELECT sale
FROM Sales S2
WHERE S2.year = S1.year - 1) THEN '↑'
WHEN sale <
(SELECT sale
FROM Sales S2
WHERE S2.year = S1.year - 1) THEN '↓'
ELSE '-' END AS var
FROM Sales S1
ORDER BY year;
--使用自连接
SELECT S1.year, S1.sale,
CASE WHEN S1.sale = S2.sale THEN '→'
WHEN S1.sale > S2.sale THEN '↑'
WHEN S1.sale < S2.sale THEN '↓'
FROM Sales S1, Sales S2
WHERE S2.year = S1.year - 1
ORDER BY year;
执行结果:
时间轴有间断时:和过去最临近的时间进行比较
Sale2表中丢失了过去个别年份的数据。找出与过去最临近的年份营业额相同的年份。
--使用关联子查询
SELECT year, sale
FROM Sales2 S1
WHERE sale =
(SELECT sale
FROM Sale2 S2
WHERE S2.year =
(SELECT MAX(year)
FROM Sales2 S3
WHERE S1.year > S3.year))
ORDER BY year;
用自连接,可以减少一层子查询的嵌套
SELECT S1.year AS year,
S1.sale AS sale
FROM Sales2 S1, Sales2 S2
WHERE S1.sale = S2.sale
AND S2.year = (SELECT MAX(year)
FROM Sales2 S3
WHERE S1.year > S3.year)
ORDER BY year;
移动累计值和移动平均值
求截止到每个处理日期的处理金额的累计值。
使用窗口函数:
SELECT prc_date, prc_amt,
SUM(prc_amt) OVER (ORDER BY prc_date) AS onhand_amt
FROM Accounts;
使用冯·诺伊曼递归集合:
SELECT prc_date, A1.prc_amt,
(SELECT SUM(prc_amt)
FROM Accounts A2
WHERE A1.prc_date >= A2.prc_date) AS onhand_amt
FROM Accounts A1
ORDER BY prc_date;
以3次处理为单位求移动累计值,如下表所示
使用窗口函数:
SELECT prc_date, prc_amt,
SUM(prc_amt) OVER (ORDER BY prc_date
ROWS 2 PRECEDING) AS onhand_amt
FROM Accounts;
使用标量子查询:
SELECT prc_date, A1.prc_amt,
(SELECT SUM(prc_amt)
FROM Accounts A2
WHERE A1.prc_date >= A2
AND (SELECT COUNT(*)
FROM Accounts A3
WHERE A3.prc_date
BETWEEN A2.prc_date AND A1.prc_date) <= 3)
AS mvg_sum
FROM Accounts A1
ORDER BY prc_date;
查询重叠的时间区间
该表是某一房间在某段期间内的预约情况,要查询出住宿日期重叠的客人并列表显示。
SELECT reserver, start_date, end_date
FROM Reservations R1
WHERE EXISTS
(SELECT *
FROM Reservations R2
WHERE R1.reserver <> R2.reserver
--与自己之外的客人进行比较
AND (R1.start_date BETWEEN
R2.start_date AND R2.end_date
--条件1:自己的入住日期在他人的住宿期间内
OR R1.end_date BETWEEN
R2.start_date AND R2.end_date));
--条件2:自己的离店日期在他人的住宿期间内
1-7 用SQL进行集合运算
比较表和表:检查集合相等性之基础篇
假如有两张表tbl_A和tbl_B,结构相同,要看两者内容是否相同:
SELECT COUNT(*) AS row_cnt
FROM (SELECT *
FROM tbl_A
UNION
SELECT *
FROM tbl_B) TMP;
如果两张表行数都是3,上述SQL语句执行结果也为3,则说明两张表是相等的。
- 对于任意的表S,都有下面的公式成立:
S UNION S = S
UNION运算是幂等的。
比较表和表:检查集合相等性之进阶篇
利用(A∪B)=(A∩B)⇔(A=B)
SELECT CASE WHEN COUNT(*) = 0
THEN '相等'
ELSE '不相等' END AS result
FROM ((SELECT * FROM tbl_A
UNION
SELECT * FROM tbl_B)
EXCEPT
(SELECT * FROM tbl_A
INTERSECT
SELECT * FROM tbl_B)) TMP;
用差集实现关系除法运算
从表EmpSkills中找出精通表Skills中所有技术的员工:
SELECT DISTINCT emp
FROM EmpSkills ES1
WHERE NOT EXISTS
(SELECT skill
FROM Skills
EXCEPT
SELECT skill
FROM EmpSkills ES2
WHERE ES1.emp = ES2.emp);
针对每个员工进行集合运算,从需求的技术的集合中减去每个员工自己的技术的集合,如果结果是空集,则说明该员工具备所有的需求的技术,否则说明该员工不具备某些需求的技术。
寻找相等的子集
要找出经营的零件在种类数和种类上都完全相同的供应商组合。
SELECT SP1.sup AS s1, SP2.sup AS s2
FROM SupParts SP1, SupParts SP2
WHERE SP1.sup < SP2.sup -- 生成供应商的全部组合
AND SP1.part = SP2.part -- 条件1:经营同种类型的零件
GROUP BY SP1.sup, SP2.sup
HAVING COUNT(*) = (SELECT COUNT(*)
FROM SupParts SP3
WHERE SP3.sup = SP1.sup)
AND COUNT(*) = (SELECT COUNT(*)
FROM SupParts SP4
WHERE SP4.sup = SP2.sup);
-- 条件2:经营的零件种类数相同
练习题1-7-2:精确关系除法运算
在“用差集实现关系除法运算”部分,我们学习了将除法还原成减法来运算的方法。请修改这条SQL语句,实现“精确关系除法运算”,选择刚好拥有全部技术的员工。
SELECT DISTINCT emp
FROM EmpSkills ES1
WHERE NOT EXISTS
(SELECT skill
FROM Skills
EXCEPT
SELECT skill
FROM EmpSkills ES2
WHERE ES1.emp = ES2.emp)
AND NOT EXISTS
SELECT skill
FROM EmpSkills ES2
WHERE ES1.emp = ES2.emp)
EXCEPT
SELECT skill
FROM Skills);
之前使用有余数的除法运算时,员工可以掌握被要求的技术之外的其他技术,而这次我们查询的是掌握的技术和所要求的技术完全一致的员工,所以不仅要求EmpSkills-Skills
是空集,同时也要求Skills-EmpSkills
是空集。 第二种解法是通过员工掌握的技术数目来匹配,代码如下:
SELECT emp
FROM EmpSkills ES1
WHERE NOT EXISTS
(SELECT skill
FROM Skills
EXCEPT
SELECT skill
FROM EmpSkills ES2
WHERE ES1.emp = ES2.emp)
GROUP BY emp
HAVING COUNT(*) = (SELECT COUNT(*) FROM Skills);