MySQL GROUP BY 和 HAVING 详解
1. GROUP BY 基础
1.1 基本用法
GROUP BY
子句用于将查询结果按照指定的列进行分组,通常与聚合函数一起使用。
-- 按部门分组,计算每个部门的平均工资
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department;
1.2 多列分组
可以对多个列进行分组,创建更复杂的分组策略。
-- 按部门和职位分组
SELECT
department,
job_title,
COUNT(*) as employee_count
FROM employees
GROUP BY department, job_title;
2. HAVING 子句
2.1 过滤分组
HAVING
子句用于对分组后的结果进行条件过滤,类似于 WHERE
,但专门用于分组后的筛选。
-- 只显示平均工资超过 50000 的部门
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department
HAVING avg_salary > 50000;
2.2 复杂条件过滤
HAVING
可以使用更复杂的条件和聚合函数。
-- 显示员工数大于 5 且平均工资超过 60000 的部门
SELECT
department,
AVG(salary) as avg_salary,
COUNT(*) as employee_count
FROM employees
GROUP BY department
HAVING
employee_count > 5 AND
avg_salary > 60000;
3. GROUP BY 与 WHERE 的结合
3.1 先过滤再分组
可以先使用 WHERE
过滤数据,再使用 GROUP BY
分组。
-- 先筛选 2022 年的数据,再按部门分组
SELECT
department,
SUM(sales_amount) as total_sales
FROM sales
WHERE sale_date >= '2022-01-01'
GROUP BY department;
4. 高级分组技巧
4.1 WITH ROLLUP
使用 WITH ROLLUP
可以生成小计和总计行。
-- 生成部门销售额的小计和总计
SELECT
department,
product_category,
SUM(sales_amount) as total_sales
FROM sales
GROUP BY department, product_category WITH ROLLUP;
4.2 分组排序
可以对分组结果进行排序。
-- 按平均工资降序排列部门
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;
5. 常见陷阱和最佳实践
5.1 SELECT 列限制
在使用 GROUP BY
时,SELECT
中的非聚合列必须出现在 GROUP BY
子句中。
-- 正确示例
SELECT
department,
job_title,
AVG(salary) as avg_salary
FROM employees
GROUP BY department, job_title;
-- 错误示例(job_title 未在 GROUP BY 中)
SELECT
department,
job_title, -- 这将导致错误
AVG(salary) as avg_salary
FROM employees
GROUP BY department;
5.2 性能考虑
对大型数据集进行分组可能会影响性能,建议:
- 创建适当的索引
- 避免不必要的复杂分组
- 使用
EXPLAIN
分析查询执行计划
6. 实际应用场景
6.1 电商销售分析
-- 按产品类别和月份分析销售情况
SELECT
YEAR(sale_date) as sale_year,
MONTH(sale_date) as sale_month,
product_category,
SUM(sales_amount) as total_sales,
COUNT(order_id) as order_count
FROM sales
GROUP BY
YEAR(sale_date),
MONTH(sale_date),
product_category
HAVING total_sales > 10000
ORDER BY sale_year, sale_month, total_sales DESC;
6.2 人力资源分析
-- 分析不同年龄段的员工情况
SELECT
CASE
WHEN age < 25 THEN '25岁以下'
WHEN age BETWEEN 25 AND 35 THEN '25-35岁'
WHEN age BETWEEN 36 AND 45 THEN '36-45岁'
ELSE '45岁以上'
END as age_group,
AVG(salary) as avg_salary,
COUNT(*) as employee_count
FROM employees
GROUP BY age_group
HAVING employee_count > 10
ORDER BY avg_salary DESC;
7. 分页查询
7.1 基本分页
使用 LIMIT
和 OFFSET
实现简单分页:
-- 查询第2页,每页10条记录(假设第1页从0开始)
SELECT
department,
AVG(salary) as avg_salary,
COUNT(*) as employee_count
FROM employees
GROUP BY department
LIMIT 10 OFFSET 10;
7.2 带排序的分页
-- 按平均工资降序排列,获取第3页的数据
SELECT
department,
AVG(salary) as avg_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC
LIMIT 10 OFFSET 20;
总结
GROUP BY
用于数据分组和聚合HAVING
用于对分组结果的条件过滤- 注意
SELECT
列的限制 - 合理使用可以进行复杂的数据分析