MySQL 进阶查询:GROUP BY、HAVING 与 LIMIT 实战详解
(2025–2026 年视角,重点放在真实业务场景 + 常见陷阱 + 优化技巧)
这三个关键字是 SQL 中最常一起出现的“分组统计三人组”,也是面试、报表、数据分析、后台管理系统中最频繁被考察的部分。
一、最核心的三句话(建议背下来)
- GROUP BY:先把数据按某个字段(或多个字段)分组,然后对每组执行聚合函数(COUNT、SUM、AVG、MAX、MIN 等)
- HAVING:对分组之后的结果进行过滤(相当于分组后的 WHERE)
- LIMIT:对最终结果集进行行数限制(分页、Top N 的核心)
执行顺序口诀(非常重要,错了就容易写出错误 SQL):
FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
二、经典业务场景 + 对应写法
场景 1:统计每个城市的订单数量(最基础 GROUP BY)
SELECT
city,
COUNT(*) AS order_count,
SUM(amount) AS total_amount
FROM orders
WHERE order_date >= '2025-01-01'
GROUP BY city;
变体:按城市 + 订单状态双维度分组
SELECT
city,
status,
COUNT(*) AS cnt,
AVG(amount) AS avg_amount
FROM orders
GROUP BY city, status;
场景 2:找出订单金额 Top 10 的城市(GROUP BY + ORDER BY + LIMIT)
SELECT
city,
ROUND(SUM(amount), 2) AS total_revenue
FROM orders
WHERE YEAR(order_date) = 2025
GROUP BY city
ORDER BY total_revenue DESC
LIMIT 10;
场景 3:只显示订单数 ≥ 100 的城市(HAVING 的典型用法)
SELECT
city,
COUNT(*) AS order_count
FROM orders
GROUP BY city
HAVING order_count >= 100
ORDER BY order_count DESC;
关键点:HAVING 只能作用于聚合函数或分组字段
下面是错误的写法(会报错):
-- 错误!HAVING 里不能用非聚合字段(除非它也在 GROUP BY 中)
HAVING city = '上海' -- ×
-- 正确写法
HAVING city = '上海' -- 如果 city 在 GROUP BY 中,可以
或
WHERE city = '上海' -- 应该提前过滤,用 WHERE
场景 4:每个用户最近一次订单金额(分组 + 子查询 + LIMIT 1)
-- 写法一:窗口函数(MySQL 8.0+ 推荐,性能好)
SELECT
user_id,
order_id,
amount,
order_date
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date DESC) AS rn
FROM orders
) t
WHERE rn = 1;
-- 写法二:GROUP BY + 子查询(兼容 5.7)
SELECT
o.user_id,
o.order_id,
o.amount
FROM orders o
INNER JOIN (
SELECT user_id, MAX(order_date) AS max_date
FROM orders
GROUP BY user_id
) m ON o.user_id = m.user_id AND o.order_date = m.max_date;
场景 5:分页查询 + 统计总数(真实后台列表必备)
-- 分页查询(第 3 页,每页 20 条)
SELECT *
FROM orders
WHERE status = 'PAID'
ORDER BY create_time DESC
LIMIT 40, 20; -- offset 40,limit 20
-- 同时返回总数(前端分页组件需要)
SELECT FOUND_ROWS() AS total; -- 配合 SQL_CALC_FOUND_ROWS(8.0+ 不推荐)
-- 更推荐的现代写法(分开两条 SQL)
-- 第一条:SELECT COUNT(*) FROM orders WHERE ...
-- 第二条:SELECT ... LIMIT ?, ?
三、常见陷阱与正确写法对照表
| 错误写法(会报错或结果不对) | 正确写法 | 原因说明 |
|---|---|---|
| SELECT city, amount FROM orders GROUP BY city; | SELECT city, SUM(amount) FROM orders GROUP BY city; | GROUP BY 后非分组列必须用聚合函数 |
| HAVING amount > 1000 | HAVING SUM(amount) > 1000 或 WHERE amount > 1000 | HAVING 只能用于聚合结果或分组列 |
| GROUP BY city HAVING city = ‘北京’ | WHERE city = ‘北京’ GROUP BY city | 过滤原始数据用 WHERE,过滤分组后结果用 HAVING |
| SELECT city, COUNT(*) FROM orders LIMIT 10; | 先 GROUP BY 再 LIMIT | LIMIT 在 GROUP BY 之后生效 |
| GROUP BY DATE_FORMAT(create_time, ‘%Y-%m’) | GROUP BY DATE_FORMAT(create_time, ‘%Y-%m’) 或 用别名后 HAVING | MySQL 允许 GROUP BY 表达式,但建议清晰 |
| GROUP BY 1, 2 | GROUP BY city, status | 生产环境禁用 GROUP BY 序号(可读性差、易出错) |
四、性能优化建议(真实项目高频点)
- GROUP BY 字段一定要有索引
尤其是高基数字段(city、user_id 等) - WHERE 条件尽量先过滤
把能大幅减少记录数的条件写在 WHERE,而不是 HAVING - 避免在 GROUP BY 上使用函数(会导致无法走索引)
错误:GROUP BY YEAR(order_date)
推荐:新建计算列或用范围查询 - 分页优化(大表深分页)
-- 推荐:使用覆盖索引 + 延迟关联
SELECT o.*
FROM orders o
INNER JOIN (
SELECT id FROM orders
WHERE status = 'PAID'
ORDER BY create_time DESC
LIMIT 100000, 20
) t ON o.id = t.id;
- HAVING COUNT(*) > N 优化
可以先用子查询或临时表提前过滤
五、2025–2026 年面试/项目高频题快速记忆
- GROUP BY 与 WHERE、HAVING 的执行先后顺序?
- 为什么 HAVING 里能用聚合函数,而 WHERE 不行?
- 如何实现“每个分类销量 Top 3 商品”?(窗口函数 vs 自连接)
- 大表分页第 10000 页怎么优化?
- GROUP BY 后怎么同时返回明细和汇总?(WITH ROLLUP / 窗口函数)
一句话总结:
GROUP BY 是分组,HAVING 是分组后的 WHERE,LIMIT 是最终截断。 写 SQL 时永远记住“先过滤、再分组、后截取”的逻辑顺序。
重阳,你现在是在做哪类统计需求?
- 商品分类销量 Top N
- 用户行为漏斗分析
- 日活/月活统计
- 还是后台分页列表 + 总数?
告诉我具体场景,我可以给你最贴合的完整 SQL + 索引建议~