p41 聚合函数√
聚合函数:输入一系列值并聚合为一个结果的函数。
聚合函数只运行非空值,如果列中有空值,不会被算在函数里。
USE sql_invoicing;
SELECT -- SELECT选择的不仅可以是列,也可以是列间表达式、列的聚合函数
MAX(invoice_date) latest_date,
MIN(invoice_total) lowest,
AVG(invoice_total) average,
SUM(invoice_total * 1.1) total, -- 可以统计表达式
COUNT(*) total_records, -- 统计所有记录数目
COUNT(invoice_total) number_of_invoices, -- 和上一个相等
COUNT(payment_date) number_of_payments, -- 只统计了结账了的记录数,因为忽略了空值(空值不会被计数)
COUNT(DISTINCT client_id) number_of_distinct_clients -- DISTINCT client_id 筛掉了该列的重复值,再COUNT计数,会得到付费了的顾客数
FROM invoices
WHERE invoice_date > '2019-07-01'; -- 想只统计下半年的结果
练习
目标:
date_range | total_sales | total_payments | what_we_expect (the difference) |
---|---|---|---|
1st_half_of_2019 | |||
2nd_half_of_2019 | |||
Total |
USE sql_invoicing;
SELECT
'1st_half_of_2019' AS date_range, -- 每行给个别名
SUM(invoice_total) AS total_sales,
SUM(payment_total) AS total_payments,
SUM(invoice_total - payment_total) AS what_we_expect
FROM invoices
WHERE invoice_date BETWEEN '2019-01-01' AND '2019-06-30'
UNION
SELECT
'2st_half_of_2019' AS date_range,
SUM(invoice_total) AS total_sales,
SUM(payment_total) AS total_payments,
SUM(invoice_total - payment_total) AS what_we_expect
FROM invoices
WHERE invoice_date BETWEEN '2019-07-01' AND '2019-12-31'
UNION
SELECT
'Total' AS date_range,
SUM(invoice_total) AS total_sales,
SUM(payment_total) AS total_payments,
SUM(invoice_total - payment_total) AS what_we_expect
FROM invoices
WHERE invoice_date BETWEEN '2019-01-01' AND '2019-12-31';
p42 Group by子句√
把每个列按照group by 给定的client_id字段分组,再然后用各种聚合函数去处理这些分组,返回每个client_id对应的行。
之前学习了如何用聚合函数汇总函数,但我们想知道每个客户的总销售额怎么办?那就需要对一列或多列分组了。
SELECT
client_id,
SUM(invoice_total) as total_sales
From invoices
group by client_id;
SELECT
state,
city,
SUM(invoice_total) as total_sales
From invoices i
join clients using(client_id)
group by state, city -- 也可以根据多列分组,比如统计一个州不同城市的销售额 多列分组就是state city必须都要相同才作为一组
练习
在 payments 表中,按日期和支付方式分组统计总付款额
每个分组显示一个日期和支付方式的独立组合,可以看到某特定日期特定支付方式的总付款额。这个例子里每一种支付方式可以在不同日子里出现,每一天也可以出现多种支付方式。
USE sql_invoicing;
SELECT
date,
pm.name AS payment_method, -- 每一行是日期和支付方式的不同组合
SUM(amount) AS total_payments
FROM payments p
JOIN payment_methods pm
ON p.payment_method = pm.payment_method_id
GROUP BY date, payment_method
-- 用的是 SELECT 里的列别名
ORDER BY date;
43 Having子句√
HAVING 和 WHERE 都是是条件筛选语句。
- where是筛选 分组之前的数据,但必须用原表列名而不能用SELECT中给定的列别名。而having是筛选分组之后的数据。
- HAVING …… 对 SELECT…GROUP BY… 查询后(通常是分组并聚合查询后)的结果列进行筛选。
-- 统计某个 总销售额大于500,发票数量大与5 的客户
USE sql_invoicing;
SELECT
client_id,
SUM(invoice_total) AS total_sales,
COUNT(*) AS number_of_invoices
FROM invoices
GROUP BY client_id
HAVING total_sales > 500 AND number_of_invoices > 5;
练习
在 sql_store 数据库(有顾客表、订单表、订单项目表等)中,找出在 ‘VA’ 州且消费总额超过100美元的顾客(这是一个面试级的问题,还很常见)
USE sql_store;
SELECT
c.customer_id,
c.first_name,
c.last_name,
SUM(oi.quantity * oi.unit_price) AS total_sales
FROM customers c
JOIN orders o USING(customer_id)
JOIN order_items oi USING(order_id)
WHERE state = 'VA'
GROUP BY
c.customer_id,
c.first_name,
c.last_name
HAVING total_sales > 100;
p44 ROLLUP运算符(感觉不常用)
GROUP BY …… WITH ROLL UP自动汇总型分组,若是多字段分组的话汇总也会是多层次的,注意这是MySQL扩展语法,不是SQL标准语法。
案例
分组查询各客户的发票总额以及所有人的总发票额。
USE sql_invoicing;
SELECT
client_id,
SUM(invoice_total)
FROM invoices
GROUP BY client_id WITH ROLLUP;
多字段分组
例1:分组查询各州、市的总销售额(发票总额)以及州层次和全国层次的两个层次的汇总额。
SELECT
state,
city,
SUM(invoice_total) AS total_sales
FROM invoices
JOIN clients USING (client_id)
GROUP BY state, city WITH ROLLUP;
例2:分组查询特定日期特定付款方式的总支付额以及单日汇总和整体汇总。
USE sql_invoicing;
SELECT
date,
pm.name AS payment_method,
SUM(amount) AS total_payments
FROM payments p
JOIN payment_methods pm
ON p.payment_method = pm.payment_method_id
GROUP BY date, pm.name WITH ROLLUP
练习
分组计算各个付款方式的总付款 并汇总。
SELECT
pm.name AS payment_method,
SUM(amount) AS total
FROM payments p
JOIN payment_methods pm
ON p.payment_method = pm.payment_method_id
GROUP BY pm.name WITH ROLLUP