[Python Web Scraping in Practice: Scraping Data from a Wallpaper Website]

Grail Lee

Last modified 2022-04-08 22:45:33


Category: Scraper Examples. Tags: python

First published 2022-04-08 16:24:19

Article link: https://blog.csdn.net/weixin_44240659/article/details/124044152


Scraping targeted descriptive information from a web page

Prerequisites:

1. The requests, bs4, and lxml libraries

2. An IDE such as PyCharm

Steps:

1. Import the libraries

from bs4 import BeautifulSoup
import requests

if __name__ == '__main__':
    target = "https://www.xxxx.com/"

2. Fetch the page data with the get method of the requests library

    r = requests.get(url=target)  # fetch the page content
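The bare get call above will wait indefinitely on a stalled server and will not complain about a 404 or 500 response. A slightly more defensive version of this step might look like the sketch below. The helper name fetch_html, the User-Agent string, and the 10-second timeout are assumptions made for illustration, not part of the original script.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the decoded body of `url`, raising on HTTP error statuses.

    `session` is optional so that callers can reuse one connection pool
    across many pages; a fresh requests.Session is created otherwise.
    """
    http = session or requests.Session()
    # Some sites reject the default python-requests User-Agent;
    # the value below is only an illustrative placeholder.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = http.get(url, headers=headers, timeout=timeout)
    # Turn 4xx/5xx responses into requests.HTTPError instead of
    # silently parsing an error page.
    response.raise_for_status()
    return response.text
```

With this helper, step 2 would become html = fetch_html(target), and html can be passed to BeautifulSoup in place of r.text.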

3. Create the bs4 object, using lxml as the parser

    bs = BeautifulSoup(r.text, 'lxml')  # create the BeautifulSoup object

    img_url = []     # collected href values
    title_list = []  # collected title values
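The 'lxml' argument in step 3 only works when the lxml package is installed; otherwise BeautifulSoup raises FeatureNotFound. As a small sketch, the snippet below falls back to Python's built-in html.parser in that case. The sample HTML string is made up for the demonstration.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# A tiny made-up document, used instead of a live page.
html = "<html><head><title>Demo</title></head><body><p>hi</p></body></html>"

try:
    # Preferred: the fast lxml parser used in this article.
    soup = BeautifulSoup(html, "lxml")
except FeatureNotFound:
    # Fallback: the parser bundled with the standard library.
    soup = BeautifulSoup(html, "html.parser")

print(soup.title.string)
```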

4. Find the a tags with class="title" and read their title and href attributes

    # find every a tag with class="title"
    for a in bs.find_all('a', class_='title'):
        # get the title attribute of the a tag
        title_list.append(a.get('title'))
        # get the href attribute of the a tag
        img_url.append(a.get('href'))
        # print the title and href
        print(a.get('title'), a.get('href'))
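Because a.get() returns None when an attribute is missing, and href values are often relative paths, the loop above can collect None entries and links that are not directly usable. The sketch below shows one way to handle both cases on an inline sample page. The HTML, the example.com URLs, and the function name extract_titles_and_links are all hypothetical, and html.parser is used here only so the example runs without lxml.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up markup that mimics the structure targeted in step 4.
SAMPLE_HTML = """
<div>
  <a class="title" title="Mountain Lake" href="/wallpaper/1.html">Mountain Lake</a>
  <a class="title" title="City Night" href="https://img.example.com/2.html">City Night</a>
  <a class="title" href="/wallpaper/3.html">No title attribute</a>
  <a class="other" title="Ignored" href="/about.html">About</a>
</div>
"""


def extract_titles_and_links(html, base_url):
    """Return (title, absolute_url) pairs for every <a class="title">."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for a in soup.find_all("a", class_="title"):
        title = a.get("title")
        href = a.get("href")
        # Skip tags that lack either attribute instead of storing None.
        if not title or not href:
            continue
        # Resolve relative paths against the page URL.
        results.append((title, urljoin(base_url, href)))
    return results


pairs = extract_titles_and_links(SAMPLE_HTML, "https://www.example.com/")
for title, link in pairs:
    print(title, link)
```

Returning pairs keeps each title next to its link, which avoids the two parallel lists drifting out of step if one attribute is missing.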