Scraping Suzhou Weather with Python and Saving It to Excel

小琪爷

Tags: python

Article link: https://blog.csdn.net/qq_52154193/article/details/113664420

I. What is a crawler?

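As I understand it, a crawler is a program that fetches resources from the web in a targeted way. A browser works in a similar spirit: when you visit a page, it acts like a spider that requests the page and pulls down its content. A crawler does the same thing from a script and keeps only the data it is after.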
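To make "targeted" concrete, here is a minimal sketch that keeps only the headings of a page and ignores everything else. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without any installs. The sample HTML and the names SAMPLE_HTML, HeadingCollector and extract_headings are made up for illustration; a real crawler would download the page first.

```python
from html.parser import HTMLParser

# Made-up sample page; a real crawler would download this over HTTP.
SAMPLE_HTML = """
<html><body>
  <div class="ad">Buy now!</div>
  <h1>Day 7 (today)</h1>
  <p>Some unrelated text</p>
  <h1>Day 8 (tomorrow)</h1>
</body></html>
"""


class HeadingCollector(HTMLParser):
    """Collects the text inside every <h1> tag and ignores everything else."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the current heading.
        if self.in_h1:
            self.headings[-1] += data


def extract_headings(html_text):
    """Return the stripped text of every <h1> in html_text."""
    parser = HeadingCollector()
    parser.feed(html_text)
    parser.close()
    return [heading.strip() for heading in parser.headings]


print(extract_headings(SAMPLE_HTML))
```

The advertisement and the paragraph are downloaded along with the rest of the page but never reach the result, which is the difference between browsing a page and crawling it.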

II. Steps

1. Import the libraries

import xlwt                      # writes legacy .xls workbooks
import requests                  # HTTP client
from bs4 import BeautifulSoup    # HTML parsing; the 'lxml' parser also needs the lxml package
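None of these packages ship with Python, and the import name bs4 differs from the name used with pip (beautifulsoup4). As a rough sketch, the check below reports which dependencies are missing before the scraper is run. The helper name missing_packages and the REQUIRED mapping are hypothetical and not part of the original script.

```python
import importlib.util

# Import name -> name used with "pip install" (they differ for bs4).
REQUIRED = {
    "xlwt": "xlwt",
    "requests": "requests",
    "bs4": "beautifulsoup4",
    "lxml": "lxml",
}


def missing_packages(required=None):
    """Return the pip names of the packages that cannot be found."""
    if required is None:
        required = REQUIRED
    return [
        pip_name
        for import_name, pip_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages()
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("All dependencies are available.")
```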

2. Scrape the Suzhou weather and save the data to Excel

import xlwt
import requests
from bs4 import BeautifulSoup


def main():
    url = 'http://www.weather.com.cn/weather/101190401.shtml'
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36 Edg/88.0.705.56'
    }
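    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()                    # stop early on an HTTP error
    html = BeautifulSoup(response.content, 'lxml')
    body = html.find('body')
    data = body.find('div', {'id': '7d'})          # container of the 7-day forecast
    ul = data.find('ul')
    lis = ul.find_all('li')                        # one <li> per day
    final_list = []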
    for day in lis:
        temp_list = []

        date = day.find('h1').string               # the date heading
        temp_list.append(date)

        info = day.find_all('p')                   # all <p> tags for this day
        temp_list.append(info[0].string)           # first <p>: weather description

        # Second <p>: the <span> holds the highest temperature.
        # It may be missing, so check before reading it.
        if info[1].find('span') is None:
            temperature_highest = ''
        else:
            temperature_highest = info[1].find('span').string
            temperature_highest = temperature_highest.replace('℃', '')

        # Second <p>: the <i> holds the lowest temperature.
        if info[1].find('i') is None:
            temperature_lowest = ''
        else:
            temperature_lowest = info[1].find('i').string
            temperature_lowest = temperature_lowest.replace('℃', '')

        temp_list.append(temperature_highest)      # add the high to temp_list
        temp_list.append(temperature_lowest)       # add the low to temp_list

        wind_scale = info[2].find('i').string      # third <p>: the <i> holds the wind scale
        temp_list.append(wind_scale)

        final_list.append(temp_list)               # one row per day

    saveDate(final_list)
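def saveDate(datalist):
    workbook = xlwt.Workbook(encoding='utf-8')
    worksheet = workbook.add_sheet('sheet1')
    # Header row: date, weather, highest temperature, lowest temperature, wind scale
    col = ('日期', '天气', '最高温度', '最低温度', '风级范围')
    for i, title in enumerate(col):
        worksheet.write(0, i, title)
    # Data rows start at row 1, right below the header
    for j, data in enumerate(datalist):
        for k, value in enumerate(data):
            worksheet.write(j + 1, k, value)
    workbook.save("苏州天气.xls")                   # "Suzhou weather.xls"; xlwt only writes the .xls format


if __name__ == "__main__":
    main()
    print('Scraping finished!')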
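The script stores temperatures as text, so Excel will treat those cells as strings. If numbers are wanted instead, a small helper along these lines could convert the scraped text before it is written. This is only a sketch: parse_temperature is a hypothetical name, and it assumes the page delivers values shaped like '12℃' or '12'.

```python
def parse_temperature(text):
    """Turn a string such as '12℃' into the integer 12.

    Returns None when the value is missing or is not a whole number.
    """
    if text is None:
        return None
    cleaned = text.replace('℃', '').strip()
    try:
        return int(cleaned)
    except ValueError:
        return None


# Made-up sample values in the shape the scraper is assumed to see.
for raw in ['12℃', '-3℃', '7', ' ', None]:
    print(repr(raw), '->', parse_temperature(raw))
```

Returning None for a missing value keeps "no data" distinct from a real reading of 0; the caller can then decide whether to write an empty cell.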

Summary

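The code above works in this order:
1. Define a main function, main().
2. Dress the request up with a browser-like User-Agent header (after all, you are not visiting with a browser).
3. Parse the page with BeautifulSoup and pull out the date, weather, temperatures and wind scale for each day. No regular expressions are involved.
4. Finally, the saveDate() function saves the data to an Excel file.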
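Saving through an Excel library is the approach taken here. As a swapped-in alternative, not the original method, the same rows could also be written as CSV with the standard library, which Excel can open as well. The function name write_csv and the sample row below are made up for illustration.

```python
import csv
import io

# Same header as the Excel sheet: date, weather, high, low, wind scale.
HEADER = ('日期', '天气', '最高温度', '最低温度', '风级范围')


def write_csv(rows, file_obj):
    """Write the header and the scraped rows to an open text file."""
    writer = csv.writer(file_obj)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Made-up sample row in the shape main() collects.
sample_rows = [['7日（今天）', '多云', '12', '5', '<3级']]

buffer = io.StringIO()
write_csv(sample_rows, buffer)
print(buffer.getvalue())

# To write a real file, open it with newline='' as the csv module expects.
# The utf-8-sig encoding adds a byte order mark that helps Excel detect UTF-8:
# with open('苏州天气.csv', 'w', newline='', encoding='utf-8-sig') as f:
#     write_csv(sample_rows, f)
```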
