PostgreSQL HAVING 子句

PostgreSQL HAVING 子句完全指南

HAVING 是聚合后的“守门员”,专门过滤 分组结果GROUP BY 后的每组)。
WHERE 不同:

  • WHERE → 过滤原始行
  • HAVING → 过滤聚合结果

一、HAVING 基本语法

SELECT 
    grouping_column,
    AGGREGATE_FUNCTION(...)
FROM table_name
[WHERE ...]           -- 过滤原始行
GROUP BY grouping_column
HAVING condition;     -- 过滤分组结果

核心规则

  • HAVING 必须与 GROUP BY 一起使用
  • 只能在 HAVING 中使用 聚合函数分组列

二、准备测试数据

CREATE TABLE sales (
    id BIGSERIAL PRIMARY KEY,
    product TEXT,
    category TEXT,
    region TEXT,
    amount NUMERIC(12,2),
    sale_date DATE,
    customer_id BIGINT
);

-- 插入多样化数据
INSERT INTO sales (product, category, region, amount, sale_date, customer_id) VALUES
('iPhone 15', 'Electronics', 'North', 1200, '2025-01-15', 101),
('T-Shirt',   'Clothing',    'North',  25, '2025-01-16', 102),
('MacBook',   'Electronics', 'South', 2500, '2025-01-17', 103),
('Jeans',     'Clothing',    'South',  80, '2025-01-18', 104),
('AirPods',   'Electronics', 'North',  250, '2025-01-19', 101),
('Jacket',    'Clothing',    'West',  150, '2025-01-20', 105),
('iPad',      'Electronics', 'North',  800, '2025-01-21', 106),
('Sneakers',  'Clothing',    'South',  120, '2025-01-22', 102);

三、WHERE vs HAVING:核心区别

过滤对象子句示例
原始行WHEREWHERE amount > 1000
聚合结果HAVINGHAVING COUNT(*) > 5
-- 错误!WHERE 不能用聚合函数
SELECT region, SUM(amount)
FROM sales
WHERE COUNT(*) > 2        -- 语法错误!
GROUP BY region;

-- 正确写法
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region
HAVING COUNT(*) > 2;

四、基础 HAVING 用法

需求SQL
订单数 > 3 的地区HAVING COUNT(*) > 3
总销售额 > 2000 的品类HAVING SUM(amount) > 2000
平均订单 > 500 的组合HAVING AVG(amount) > 500
-- 订单数 > 2 且总销售额 > 1500 的地区
SELECT region, 
       COUNT(*) AS order_count,
       SUM(amount) AS total_sales
FROM sales
GROUP BY region
HAVING COUNT(*) > 2 
   AND SUM(amount) > 1500;

输出

 region | order_count | total_sales
--------+-------------+-------------
 North  |           4 |       2275

五、复杂 HAVING 条件

-- 订单数 > 3,且最大单笔 > 1000,且平均订单 ≠ 0
SELECT region, 
       COUNT(*) AS cnt,
       MAX(amount) AS max_amt,
       AVG(amount) AS avg_amt
FROM sales
GROUP BY region
HAVING COUNT(*) > 3
   AND MAX(amount) > 1000
   AND AVG(amount) IS NOT NULL;

六、HAVING + 表达式

-- 销售额标准差 > 500 的地区(需计算方差)
SELECT region,
       STDDEV(amount) AS sales_std
FROM sales
GROUP BY region
HAVING STDDEV(amount) > 500;
-- 销售额占总比 > 40% 的品类
WITH total AS (SELECT SUM(amount) AS grand_total FROM sales)
SELECT category, SUM(amount) AS cat_total
FROM sales, total
GROUP BY category
HAVING SUM(amount) > 0.4 * total.grand_total;

七、HAVING + GROUPING SETS / ROLLUP / CUBE

-- 包含小计,但只显示订单数 > 1 的组
SELECT region, category, COUNT(*)
FROM sales
GROUP BY ROLLUP(region, category)
HAVING COUNT(*) > 1;

八、HAVING + 窗口函数?不行!

-- 错误!HAVING 不能直接用窗口函数
SELECT region, SUM(amount),
       RANK() OVER (PARTITION BY region ORDER BY SUM(amount) DESC)
FROM sales
GROUP BY region
HAVING RANK() = 1;  -- 语法错误

正确做法:用 CTE 或子查询

WITH ranked AS (
    SELECT region, SUM(amount) AS total,
           RANK() OVER (ORDER BY SUM(amount) DESC) AS rnk
    FROM sales
    GROUP BY region
)
SELECT region, total
FROM ranked
WHERE rnk = 1;

九、性能优化:HAVING 不影响索引

好消息HAVING 是在分组后过滤,不影响 WHERE 索引
建议优先用 WHERE 过滤原始行,减少分组开销
-- 推荐:先 WHERE 过滤,再 GROUP + HAVING
SELECT region, SUM(amount)
FROM sales
WHERE sale_date >= '2025-01-01'
GROUP BY region
HAVING SUM(amount) > 1000;

-- 建索引加速 WHERE
CREATE INDEX idx_sales_date ON sales(sale_date);

十、常见陷阱与解决方案

陷阱说明解决方案
WHERE COUNT(*) > 5语法错误改用 HAVING
HAVING region = 'North'GROUP BY必须加 GROUP BY region
HAVING amount > 1000amount 不是聚合列MAX(amount) 或移到 WHERE
HAVING 过滤太晚全部分组后才过滤提前用 WHERE 减少行数
-- 错误
SELECT region, amount
FROM sales
GROUP BY region
HAVING amount > 1000;

-- 正确
SELECT region, MAX(amount)
FROM sales
WHERE amount > 1000
GROUP BY region;

十一、最佳实践 Checklist

项目建议
HAVING 必须配 GROUP BY避免语法错误
优先用 WHERE 过滤原始行提升性能
HAVING 只用聚合函数或分组列符合规则
复杂过滤用 CTE清晰 + 可复用
WHERE 索引加速过滤
EXPLAIN 验证计划确保无全表扫描

十二、一键销售分析(生产级)

-- 筛选:订单数 > 2,平均订单 > 500,最大单笔 > 1000 的地区
WITH summary AS (
    SELECT 
        region,
        COUNT(*) AS order_count,
        SUM(amount) AS total_sales,
        AVG(amount) AS avg_order,
        MAX(amount) AS max_order
    FROM sales
    WHERE sale_date >= '2025-01-01'
    GROUP BY region
)
SELECT *
FROM summary
WHERE order_count > 2
  AND avg_order > 500
  AND max_order > 1000;

十三、快速实战:5 分钟掌握 HAVING

-- 1. 订单数 > 3 的地区
SELECT region, COUNT(*)
FROM sales
GROUP BY region
HAVING COUNT(*) > 3;

-- 2. 总销售额 > 2000 的品类
SELECT category, SUM(amount)
FROM sales
GROUP BY category
HAVING SUM(amount) > 2000;

-- 3. 平均订单 > 100 且有 Electronics 的地区
SELECT region, AVG(amount), STRING_AGG(category, ',')
FROM sales
GROUP BY region
HAVING AVG(amount) > 100
   AND MAX(category) = 'Electronics';

-- 4. 优化:先 WHERE 再 HAVING
SELECT region, SUM(amount)
FROM sales
WHERE amount > 50
GROUP BY region
HAVING COUNT(*) > 1;

现在就动手
sales 表中:

  1. 统计 每个 region订单数总金额,筛选 订单数 > 2 的地区
  2. 筛选 平均订单 > 1000category
  3. 用 CTE 实现:订单数 > 1 且最大单笔 > 1000region + category 组合
  4. WHERE 索引,验证 EXPLAIN 性能提升

十四、性能对比(EXPLAIN)

-- 慢:全表分组
EXPLAIN ANALYZE 
SELECT region, COUNT(*) FROM sales GROUP BY region HAVING COUNT(*) > 1;

-- 快:先过滤
EXPLAIN ANALYZE 
SELECT region, COUNT(*) FROM sales 
WHERE amount > 100 
GROUP BY region 
HAVING COUNT(*) > 1;

需要我生成:

  • 10万条销售数据 + 复杂 HAVING 分析?回复 测试数据
  • 动态 HAVING 条件(Python)?回复 Python HAVING
  • Top-N per Group + HAVING?回复 Top-N
  • EXPLAIN 可视化分析?回复 EXPLAIN 分析

随时告诉我!

类似文章

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注