10. Asynchronous I/O and High-Performance Crawler Architecture
10.1 An Asynchronous Crawler Based on aiohttp
import aiohttp
import asyncio
from bs4 import BeautifulSoup


async def fetch(session, url):
    """Fetch a page and return its HTML text, or None on failure."""
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as response:
            if response.status == 200:
                return await response.text()
            return None
    except Exception as e:
        print(f"Request failed: {e}")
        return None


async def parse_product(url):
    """Download a product page and extract basic fields from it."""
    async with aiohttp.ClientSession(
        headers={'User-Agent': 'Mozilla/5.0'}
    ) as session:
        html = await fetch(session, url)
        if html:
            soup = BeautifulSoup(html, 'lxml')
            # Placeholder selector; adapt it to the target site's actual markup.
            title = soup.select_one('h1')
            return {'url': url, 'title': title.get_text(strip=True) if title else None}
        return None
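The coroutine above handles a single URL. The sketch below shows one way to drive it for many pages at once, which is where the asynchronous architecture pays off: all requests are scheduled concurrently with asyncio.gather, and an asyncio.Semaphore caps how many run at the same time. The URL list and the concurrency limit are illustrative assumptions, and the sketch simply reuses parse_product as defined above.

import asyncio

# Hypothetical URL list for illustration; replace with real product pages.
PRODUCT_URLS = [f'https://example.com/product/{i}' for i in range(1, 51)]


async def crawl_all(urls, max_concurrency=10):
    """Run parse_product for every URL, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:
            return await parse_product(url)

    # gather() schedules all coroutines concurrently and preserves input order.
    results = await asyncio.gather(*(bounded(u) for u in urls))
    return [r for r in results if r is not None]


if __name__ == '__main__':
    products = asyncio.run(crawl_all(PRODUCT_URLS))
    print(f'Fetched {len(products)} products')

Note that because parse_product opens its own ClientSession, every page pays the cost of a new connection pool; in a production crawler you would normally create one shared session and pass it down to the worker coroutines instead.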