How to Avoid HTTP 429 Too Many Requests for Good (Hands-On Python, 2025 Edition)

A 429 is the most common "death sentence" in scraping and API work. It means: you are sending requests too fast, and the server has rate-limited you.

Below is a complete anti-429 playbook, from beginner to production grade, that covers the vast majority of real-world scenarios.
1. Common Triggers for a 429 (Know Your Enemy First)

| Trigger | Typical threshold (approximate) | Typical sites |
|---|---|---|
| Requests per second (RPS) | 5~50 | Weibo, Zhihu, Douban |
| Requests per minute | 60~1000 | Baidu, JD, Taobao |
| Many requests from the same IP in a short window | ~100/minute | Most websites |
| Missing User-Agent | Immediate 429, or an outright IP ban | Almost every modern site |
| Missing required cookies | Anti-bot measures triggered | Douyin, Bilibili, WeChat Official Accounts |
| Sudden spike in request rate | Risk-control systems triggered | All major platforms |
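Before picking a countermeasure, it helps to handle a 429 correctly when one arrives. Per RFC 9110, the `Retry-After` header may carry either a number of seconds or an HTTP date; the helper below (the name `parse_retry_after` is my own) normalizes both forms into seconds. A minimal sketch:

```python
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

def parse_retry_after(value, default=10):
    """Normalize a Retry-After header value to seconds to wait.

    The header may be delta-seconds ("120") or an HTTP date
    ("Wed, 21 Oct 2025 07:28:00 GMT"). Returns `default` when the
    value is missing or unparseable.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():  # delta-seconds form
        return int(value)
    try:  # HTTP-date form
        when = parsedate_to_datetime(value)
        delta = (when - datetime.now(timezone.utc)).total_seconds()
        return max(0, int(delta))
    except (TypeError, ValueError):
        return default
```

Used on a response: `time.sleep(parse_retry_after(resp.headers.get("Retry-After")))`.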
2. The Python Anti-429 Toolkit (8 Techniques, Mix and Match)

★☆☆ Technique 1: Basic sleep

```python
import time

time.sleep(1)  # sleep 1 second after every request; crude but simple
```

Good for: small sites, learning.

★★☆ Technique 2: Random sleep (recommended)

```python
import time, random

time.sleep(random.uniform(0.5, 2.5))  # random 0.5~2.5 s delay
```

Good for: gets roughly 90% of scrapers through.
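Between a fixed random sleep and full retry machinery sits exponential backoff with jitter: after each consecutive failure the maximum wait roughly doubles, and the randomization keeps many clients from retrying in lockstep. A minimal "full jitter" sketch (the function name `backoff_delay` is my own):

```python
import random

def backoff_delay(attempt, base=0.5, cap=60.0):
    """Return a randomized delay for a 0-based retry attempt.

    "Full jitter": pick uniformly between 0 and min(cap, base * 2**attempt),
    so concurrent clients spread their retries out instead of colliding.
    """
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

On attempt 0 this yields up to 0.5 s, on attempt 3 up to 4 s, and it never exceeds the 60 s cap.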
★★☆ Technique 3: requests + automatic retry (elegant)

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

s = requests.Session()
retries = Retry(total=5, backoff_factor=1, status_forcelist=[429, 500, 502, 503, 504])
s.mount("http://", HTTPAdapter(max_retries=retries))
s.mount("https://", HTTPAdapter(max_retries=retries))
```

Good for: API calls, lightweight scrapers.

★★★ Technique 4: Limit concurrency (a must for multithreaded/async code)

Thread pool:

```python
from concurrent.futures import ThreadPoolExecutor

# crawl: your fetch function; urls: your list of URLs
with ThreadPoolExecutor(max_workers=5) as pool:  # at most 5 concurrent workers
    pool.map(crawl, urls)
```

Async:
```python
import asyncio
import aiohttp

sem = asyncio.Semaphore(10)  # at most 10 concurrent requests

async def fetch(session, url):
    async with sem:
        async with session.get(url) as resp:
            return await resp.text()
```

Good for: large-scale crawling.

★★★ Technique 5: Dynamically adjust frequency (smart)
```python
import time

class SmartSleep:
    def __init__(self):
        self.last_time = 0
        self.min_interval = 0.2  # minimum gap between requests, in seconds

    def sleep(self):
        now = time.time()
        diff = now - self.last_time
        if diff < self.min_interval:
            time.sleep(self.min_interval - diff)
        self.last_time = time.time()

ss = SmartSleep()
```

Usage: call ss.sleep() after every request.
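SmartSleep enforces a fixed minimum gap between requests. A common refinement, sketched below under my own naming, is a token bucket: it allows short bursts up to `capacity` requests while holding the long-run average at `rate` requests per second.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests while averaging `rate` req/s."""

    def __init__(self, rate=5.0, capacity=10):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def acquire(self):
        """Block until one token is available, then consume it."""
        while True:
            now = time.monotonic()
            # refill tokens in proportion to the time elapsed
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.updated) * self.rate)
            self.updated = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            time.sleep((1 - self.tokens) / self.rate)

bucket = TokenBucket(rate=5.0, capacity=10)
# call bucket.acquire() before every request
```

The burst allowance is the design point: a crawler that mostly idles can fire a few requests back to back without tripping per-second limits, yet cannot exceed the average rate for long.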
Use technique 5 for high-intensity scrapers.

★★★ Technique 6: Distributed crawling + rotating proxy pool (the heavy artillery)

```python
import random
import requests

proxies = [
    "http://1.2.3.4:8888",
    "http://5.6.7.8:9999",
    # ...up to ~1000 proxies
]
proxy = random.choice(proxies)
# url: the page you want to fetch
requests.get(url, proxies={"http": proxy, "https": proxy})
```

Good for: scrapers that must survive bans (paid proxy services suggested in the original: 芝麻, 讯代理, 瞬火).
★★☆ Technique 7: A fully disguised request header

```python
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
}
```

Good for: every site; essential everywhere.
★★☆ Technique 8: Respect robots.txt + crawl politely

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://xxx.com/robots.txt")
rp.read()
if rp.can_fetch("*", url):
    ...  # only crawl when allowed
```

Good for: well-behaved scrapers; avoids getting blacklisted.
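robots.txt can also tell you how fast you may crawl: `RobotFileParser` exposes `crawl_delay()` and `request_rate()`, both returning `None` when the site specifies nothing. The sketch below feeds made-up rules straight to `parse()` so it runs without network access; the rules themselves are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, passed to parse() instead of fetched over HTTP
rules = [
    "User-agent: *",
    "Crawl-delay: 3",
    "Disallow: /private/",
]

rp = RobotFileParser()
rp.parse(rules)

delay = rp.crawl_delay("*")  # -> 3 (seconds between requests)
allowed = rp.can_fetch("*", "https://xxx.com/public/page.html")  # -> True
```

Honoring a declared Crawl-delay is the cheapest possible rate limit: the site told you the number, so you never have to guess the threshold.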
3. The Strongest Combination (Production-Grade Anti-429)
```python
import requests
import time
import random
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

class Anti429Session:
    def __init__(self):
        self.session = requests.Session()
        # Automatically retry on 429 and transient server errors
        retry = Retry(total=10, backoff_factor=2,
                      status_forcelist=[429, 500, 502, 503, 504])
        adapter = HTTPAdapter(max_retries=retry)
        self.session.mount("http://", adapter)
        self.session.mount("https://", adapter)
        # Random User-Agent
        self.session.headers.update({
            "User-Agent": random.choice([
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/129",
                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
                # add a few more
            ])
        })

    def get(self, url, max_attempts=3, **kwargs):
        time.sleep(random.uniform(0.3, 1.2))  # the key ingredient: a random delay
        try:
            resp = self.session.get(url, timeout=10, **kwargs)
            if resp.status_code == 429 and max_attempts > 1:
                # Retry-After may also be an HTTP date; fall back to 10 s then
                raw = resp.headers.get("Retry-After", "10")
                wait = int(raw) if raw.isdigit() else 10
                print(f"Got 429, waiting {wait} s")
                time.sleep(wait)
                return self.get(url, max_attempts=max_attempts - 1, **kwargs)
            return resp
        except requests.RequestException:
            if max_attempts <= 1:
                raise  # cap the retries instead of recursing forever
            time.sleep(5)
            return self.get(url, max_attempts=max_attempts - 1, **kwargs)

# Usage
s = Anti429Session()
r = s.get("https://httpbin.org/status/429")
```
4. Final Advice for 2025 (One Sentence to Remember)

"Slow is fast; disguise is survival."

- Always add a random sleep
- Always use realistic browser headers
- Always keep concurrency under 10
- Always keep a proxy pool on standby
- Always honor the Retry-After header

Follow these five rules and 429s will never torment you again.

If you'd like a complete, runnable anti-429 crawler framework (async + proxy pool + auto-retry + re-crawl on failure), just say the word!