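Python Web Scraping in Practice (7): Scraping Batch, Engine Displacement, and Other Data from the Vehicle Announcement Site (汽车公告网)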

Latest recommended article published on 2023-10-19 19:30:00

悦来客栈的老板

Article link: https://blog.csdn.net/qq523176585/article/details/77893373

Site: http://www.cn357.com/notice/

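Straight to the code. The script requests result pages 1 through 60 for a single brand and prints each record's announcement batch, identification code, and engine displacement as one tab-separated row.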

#coding=utf-8
import re
import requests

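def getHtml(url):
    try:
        # A timeout keeps one stalled request from hanging the whole run.
        page = requests.get(url, timeout=10)
        return page.text
    except requests.RequestException:
        # Catch only request errors instead of using a bare except.
        print("Page request failed: " + url)
        return ""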

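def getInfo(html):
    # Capture three fields per record: announcement batch (公告批次),
    # identification code (识别代号), and engine displacement (发动机排量).
    # The Chinese labels stay as-is because they must match the page text.
    reg = re.compile(r"公告批次：(.*?)，生产企业：.*?识别代号：(.*?)，轴数：.*?发动机排量：(.*?)，发动机功率：", re.S)
    for item in reg.findall(html):
        # Print one tab-separated row per record, with whitespace trimmed.
        print("\t".join(map(str.strip, item)))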


if __name__ == '__main__':
    # Walk result pages 1 through 60; only the page number changes per request.
    for i in range(1, 61):
        url = "http://www.cn357.com/cvi.php?m=cvinotice&search=n&brand=%B1%F0%BF%CB&page=" + str(i)
        html = getHtml(url)
        getInfo(html)
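
The hard-coded `brand=%B1%F0%BF%CB` in the URL is the brand name 别克 (Buick) percent-encoded as GBK bytes. To try another brand, or to check the regular expression without hitting the site, something like the sketch below may help. The names `build_url`, `parse_records`, `BASE_URL`, and `RECORD_RE` are hypothetical helpers, not part of the original script, and the sample text is made up for illustration. The real page markup may differ.

```python
import re
from urllib.parse import quote

# Endpoint copied from the script above.
BASE_URL = "http://www.cn357.com/cvi.php"

# Same pattern as in getInfo; the Chinese labels must match the page text.
RECORD_RE = re.compile(
    r"公告批次：(.*?)，生产企业：.*?识别代号：(.*?)，轴数：.*?发动机排量：(.*?)，发动机功率：",
    re.S,
)


def build_url(brand, page):
    """Build a result-page URL, percent-encoding the brand name as GBK bytes."""
    encoded = quote(brand, encoding="gbk")
    return f"{BASE_URL}?m=cvinotice&search=n&brand={encoded}&page={page}"


def parse_records(text):
    """Return (batch, identification code, displacement) tuples found in text."""
    return [
        tuple(field.strip() for field in match)
        for match in RECORD_RE.findall(text)
    ]


# Hypothetical sample record, invented for illustration only.
sample = (
    "公告批次：300，生产企业：示例企业，识别代号： LSGXXXX ，轴数：2，"
    "发动机排量：1490，发动机功率：124"
)

if __name__ == "__main__":
    print(build_url("别克", 1))
    print(parse_records(sample))
```

Returning tuples instead of printing inside the parser makes the extraction step easy to test in isolation, and the rows can still be printed tab-separated afterwards with `"\t".join(row)`.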