Scraping asynchronously loaded website data with a crawler

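  • What is asynchronous loading?
    The site returns only part of its data for each request and fetches the rest on demand. For example, some pages have no "next page" button, yet new content keeps loading as you scroll.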
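  • How do you detect asynchronous loading?
    1. Open the page in a browser, right-click, and choose "Inspect".
    2. Open the "Network" tab and select the "XHR" filter.
    As you keep scrolling down, the Network panel records every request the page makes, and you can see new pages being loaded one after another.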
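Once a request shows up under XHR, its URL usually reveals how paging works. The following is a minimal sketch using only the standard library. It assumes the request observed while scrolling looks like `https://knewone.com/discover?page=2`; that exact URL and the helper names `page_number` and `with_page` are illustrative assumptions, not something taken from the site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode

# Hypothetical request URL, as it might appear in the XHR list while scrolling.
observed = 'https://knewone.com/discover?page=2'


def page_number(request_url):
    """Return the `page` query parameter as an int, or None if it is absent."""
    query = parse_qs(urlsplit(request_url).query)
    values = query.get('page')
    return int(values[0]) if values else None


def with_page(request_url, page):
    """Return the same URL with its `page` query parameter set to `page`."""
    parts = urlsplit(request_url)
    query = parse_qs(parts.query)
    query['page'] = [str(page)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


print(page_number(observed))
print(with_page(observed, 3))
```

If the page number is the only thing that changes between consecutive requests, a crawler can request those URLs directly instead of simulating the scroll.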
  • How do you load the asynchronous data?
    A concrete example:
from bs4 import BeautifulSoup
import requests
import time

url = 'https://knewone.com/discover?page='

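def get_page(url, data=None):
    # Fetch one page of the listing and print each item's image, title and link.
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    imgs = soup.select('a.cover-inner > img')
    # The title and the link both come from the same <a> element.
    titles = soup.select('section.content > h4 > a')
    links = soup.select('section.content > h4 > a')

    # Only print when the caller has not supplied its own data.
    if data is None:
        for img, title, link in zip(imgs, titles, links):
            item = {
                'img': img.get('src'),
                'title': title.get('title'),
                'link': link.get('href')
            }
            print(item)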


def get_more_pages(start, end):
    # Fetch pages start .. end-1, pausing two seconds between requests.
    for one in range(start, end):
        get_page(url + str(one))
        time.sleep(2)


get_more_pages(1, 3)  # fetches pages 1 and 2

Output:

{'title': '魔浪 H3 三防户外蓝牙音箱', 'link': '/things/mo-lang-h3-san-fang-hu-wai-lan-ya-yin-xiang', 'img': 'https://making-photos.b0.upaiyun.com/photos/64740e1b5cf6c6af4126675cfe7b1e70.jpg!thing.fixed.big'}
{'title': '舒尔 MV88 电容迷你麦克风', 'link': '/things/shu-er-mv88-dian-rong-mi-ni-mai-ke-feng', 'img': 'https://making-photos.b0.upaiyun.com/photos/d4902fe5d63d3a38b189559e328842ed.jpg!thing.fixed.big'}
{'title': 'Minolta SRT303 胶片单反', 'link': '/things/minolta-srt303-xiao-pian-dan-fan', 'img': 'https://making-photos.b0.upaiyun.com/photos/2ed7f4414b4cd3930f4779415291939a.jpg!thing.fixed.big'}
{'title': '纪念碑谷官方周边', 'link': '/things/ji-nian-bei-gu-guan-fang-zhou-bian', 'img': 'https://making-photos.b0.upaiyun.com/photos/a5fa2393899557c3589df39349bdfbf4.jpg!thing.fixed.big'}
{'title': '卡菲单反无线取景器', 'link': '/things/qia-fei-dan-fan-wu-xian-qu-jing-qi', 'img': 'https://making-photos.b0.upaiyun.com/photos/db6d5f0c461ea2aa92c27a60104505c6.png!thing.fixed.big'}
{'title': 'naim mu-so 无线音箱', 'link': '/things/naim-mu-so-wu-xian-yin-xiang', 'img': 'https://making-photos.b0.upaiyun.com/photos/07b35556a4c7884f21c7ee9f0ba067fc.jpg!thing.fixed.big'}
{'title': 'UA MICRO G LIMITLESS TR 训练鞋', 'link': '/things/ua-micro-g-limitless-tr-xun-lian-xie', 'img': 'https://making-photos.b0.upaiyun.com/photos/64764f29fb9f8572e31c178f9359fde1.jpg!thing.fixed.big'}
{'title': 'Gecco 血源诅咒限量版手办', 'link': '/things/gecco-xie-yuan-zu-zhou-xian-liang-ban-shou-ban', 'img': 'https://making-photos.b0.upaiyun.com/photos/cdfac36b0f73fb153899c7ccba98a9e1.jpg!thing.fixed.big'}
{'title': 'BISSELL Bolt 充电两用吸尘器', 'link': '/things/bissell-bolt-chong-dian-liang-yong-xi-chen-qi', 'img': 'https://making-photos.b0.upaiyun.com/photos/02f6278b92b152aa1d562bee3956e8ff.jpg!thing.fixed.big'}
{'title': 'Lululemon Lightspeed Run Hat 男士鸭舌帽', 'link': '/things/lululemon-lightspeed-run-hat-nan-shi-ya-she-mao', 'img': 'https://making-photos.b0.upaiyun.com/photos/9e411524ab5a3a130dc62e6e8b18ffa0.jpg!thing.fixed.big'}
{'title': 'LEGO 8420 Street Bike', 'link': '/things/lego-8420-street-bike', 'img': 'https://making-photos.b0.upaiyun.com/photos/0a59d90d3745996f6b63d99fb4dfd5da.jpg!thing.fixed.big'}
{'title': 'FURYU 超级索尼子 SONICO', 'link': '/things/furyu-chao-ji-suo-ni-zi-sonico', 'img': 'https://making-photos.b0.upaiyun.com/photos/465bed39726424d4d3c2b5bb1a22397f.jpg!thing.fixed.big'}
{'title': 'MUJI 万年笔', 'link': '/things/muji-mo-nian-bi', 'img': 'https://making-photos.b0.upaiyun.com/photos/4678c7245fa1959963266783f271d402.jpg!thing.fixed.big'}
{'title': 'Kaweco special 0.7 机械铅笔', 'link': '/things/kaweco-special-0-dot-7-ji-jie-qian-bi', 'img': 'https://making-photos.b0.upaiyun.com/photos/a28fdc529384d8a84f41d0a41c022c7e.jpg!thing.fixed.big'}
{'title': 'SYM 雪松耳夹', 'link': '/things/sym-xue-song-er-jia', 'img': 'https://making-photos.b0.upaiyun.com/photos/95b25cda3fd35819d5a6d81dd7b0f8d7.jpg!thing.fixed.big'}
{'title': 'Nathome USB 桌面風扇', 'link': '/things/nathome-usb-zhuo-mian-feng-shan', 'img': 'https://making-photos.b0.upaiyun.com/photos/4c46ef948c7a62ecff2dd46f4264fc50.jpg!thing.fixed.big'}
{'title': 'SONY ICF-CS15iPN 音响', 'link': '/things/sony-icf-cs15ipn-yin-xiang', 'img': 'https://making-photos.b0.upaiyun.com/photos/2120347fbf7bc41c82b14b44570b6d88.jpg!thing.fixed.big'}
{'title': 'Helios 58mm f2 44m-7', 'link': '/things/helios-58mm-f2-44m-7', 'img': 'https://making-photos.b0.upaiyun.com/photos/cc065fa6998f9e40127c983a59d35455.jpg!thing.fixed.big'}
{'title': 'Canon Eos 55 胶片单反相机', 'link': '/things/canon-eos-55-xiao-pian-dan-fan-xiang-ji', 'img': 'https://making-photos.b0.upaiyun.com/photos/7e5b2dcec43e8a6c39746c87d2d758bd.jpg!thing.fixed.big'}
{'title': 'ELECOM SD 卡收纳盒', 'link': '/things/elecom-sd-qia-shou-na-he', 'img': 'https://making-photos.b0.upaiyun.com/photos/1d6efcf81cb33016ae80ade663ca36c0.jpg!thing.fixed.big'}
{'title': '三文堂 ECO 03017 钢笔', 'link': '/things/san-wen-tang-eco-03017-gang-bi', 'img': 'https://making-photos.b0.upaiyun.com/photos/e6e4d451259b91099ff8bb9bd86bc3d1.jpg!thing.fixed.big'}
{'title': 'Morphy richards 便携式榨汁机', 'link': '/things/morphy-richards-bian-xi-shi-zha-zhi-ji', 'img': 'https://making-photos.b0.upaiyun.com/photos/f376e9973e752fe49741e8562299379d.jpg!thing.fixed.big'}
{'title': 'Coach 零钱包', 'link': '/things/coach-ling-qian-bao', 'img': 'https://making-photos.b0.upaiyun.com/photos/8f03f6bce5e812303bdb883a2293a27e.jpg!thing.fixed.big'}
{'title': 'FUJIFILM XF 23mm 1.4R 镜头', 'link': '/things/fujifilm-xf-23mm-1-dot-4r-jing-tou', 'img': 'https://making-photos.b0.upaiyun.com/photos/008942403c600856a4614481cec31c05.jpg!thing.fixed.big'}
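
The links in the output are relative, and the page range has to be chosen by hand. A minimal sketch of one way to handle both is to keep requesting pages until one comes back empty and to resolve each link against the page URL. It assumes an empty page marks the end of the listing, which has not been verified for this site. `crawl`, `fetch_items` and `fake_fetch` are hypothetical names, and `fake_fetch` is an offline stand-in so the sketch runs without network access.

```python
import time
from urllib.parse import urljoin

BASE = 'https://knewone.com/discover?page='


def crawl(fetch_items, start=1, max_pages=50, delay=0.0):
    """Collect items page by page until a page is empty or max_pages is reached.

    fetch_items(url) is a hypothetical callable that returns a list of dicts
    with the keys 'img', 'title' and 'link'.
    """
    collected = []
    for page in range(start, start + max_pages):
        page_url = BASE + str(page)
        items = fetch_items(page_url)
        if not items:  # assumption: an empty page means nothing more to load
            break
        for item in items:
            # Resolve relative links such as '/things/...' against the page URL.
            collected.append({**item, 'link': urljoin(page_url, item['link'])})
        time.sleep(delay)  # pause between requests to go easy on the server
    return collected


def fake_fetch(url):
    """Offline stand-in for a real fetcher: two pages of made-up items."""
    pages = {
        BASE + '1': [{'img': 'a.jpg', 'title': 'A', 'link': '/things/a'}],
        BASE + '2': [{'img': 'b.jpg', 'title': 'B', 'link': '/things/b'}],
    }
    return pages.get(url, [])


for item in crawl(fake_fetch):
    print(item)
```

In real use, `fetch_items` would wrap the request and parsing steps and return the dictionaries instead of printing them. `max_pages` is a safety cap in case the site never returns an empty page.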