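First, the overall plan:
1. Choose the site to scrape
2. Import the packages (requests, bs4, matplotlib and others)
3. Fetch the page data with requests and parse it with BeautifulSoup (bs4)
4. Draw a line chart with matplotlib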
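Implementation:
1. Choose the site to scrape
The target is the Luoyang page of the historical weather site 历史天气网: lishi.tianqi.com/luoyang
2. Import the packages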
import requests
from bs4 import BeautifulSoup
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import csv
- Disguise the request as a normal browser by sending a User-Agent header
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.74 Safari/537.36 Edg/99.0.1150.46'
}
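- Open the output file (Tianqi.csv) in 'w' mode
# Open the file only once: reopening it with 'w' would erase the header row.
with open('Tianqi.csv', 'w', newline='') as file:
    w = csv.writer(file)
    # Header row: date, weekday, high, low, weather, wind direction, wind force
    w.writerow(['日期','星期','最高','最低','天气','风向','风力'])
    max = []  # daily high temperatures
    lim = []  # daily low temperatures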
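3. Fetch the page data with requests and BeautifulSoup (bs4)
    # Still inside the `with` block: loop over the 12 months of 2021.
    # The zero-padded month (202101 ... 202112) is an assumption about the site's URL scheme.
    for month in range(1, 13):
        url = f'http://lishi.tianqi.com/luoyang/2021{month:02d}.html'
        response = requests.get(url=url, headers=headers)
        text = response.text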
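The URL above depends on a month value, so one page has to be requested per month. As a minimal sketch, the twelve URLs for a year could be built up front like this. The helper name `month_urls` is hypothetical, and the zero-padded `YYYYMM` form is an assumption about how the site names its pages.

```python
def month_urls(city, year):
    """Build one history-page URL per month (hypothetical helper).

    Assumes the site names its pages <city>/<YYYYMM>.html with a zero-padded month.
    """
    base = 'http://lishi.tianqi.com'
    return [f'{base}/{city}/{year}{month:02d}.html' for month in range(1, 13)]


urls = month_urls('luoyang', 2021)
print(urls[0])    # first month of the year
print(len(urls))  # number of pages to request
```

Building the list first keeps the request loop simple: it only has to iterate over `urls`.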
- Parse the data with bs4
        soup = BeautifulSoup(text, 'lxml')
        # Each <li> under the element with class "thrui" holds one day's record
        li_list = soup.select('.thrui > li')
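- Write the data
        for li in li_list:
            # Split the row text on whitespace into its fields
            info_list = li.text.split()
            # Fields 2 and 3 are the daily high and low, each with a trailing ℃
            max.append(int(info_list[2].replace('℃', '')))
            lim.append(int(info_list[3].replace('℃', '')))
            w.writerow(info_list)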
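The loop above assumes every row splits into at least four fields and that the temperatures are always numeric. A more defensive version might look like the sketch below. `parse_row` is a hypothetical helper, and the sample strings are made up for illustration rather than taken from the site.

```python
def parse_row(text):
    """Split one row's text into fields and extract the high and low temperatures.

    Returns (fields, high, low), or None when the row does not look like a data row.
    """
    fields = text.split()
    if len(fields) < 4:
        return None  # too few fields to contain both temperatures
    try:
        high = int(fields[2].replace('℃', ''))
        low = int(fields[3].replace('℃', ''))
    except ValueError:
        return None  # temperature field is not a number
    return fields, high, low


# Made-up sample rows, for illustration only
good = parse_row('2021-01-01 星期五 6℃ -5℃ 晴 西北风 2级')
short = parse_row('日期 星期')
bad = parse_row('2021-01-02 星期六 --℃ 1℃ 晴')
print(good, short, bad)
```

Rows for which the helper returns None can simply be skipped, so a single malformed row does not abort the whole run.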
4. Draw a line chart with matplotlib
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
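# One date per scraped day (365 for a full year of data)
x = pd.date_range('20210101', periods=len(max))
# max and lim are the high/low lists filled in step 3
plt.plot(x, max, 'r-', label='最高气温')  # daily highs in red
plt.plot(x, lim, 'b-', label='最低气温')  # daily lows in blue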
plt.legend()
plt.xlabel('日期')
plt.ylabel('气温(单位:℃)')
# Save the figure to a file
plt.savefig('./lishitianqi.png')
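Since the rows are already saved in Tianqi.csv, the chart could also be redrawn later without scraping again. The sketch below reads the two temperature columns back using only the standard library. `load_temps` is a hypothetical helper, the column positions follow the header written earlier, and the demo rows are made up.

```python
import csv
import os
import tempfile


def load_temps(path, encoding=None):
    """Read the high (column 2) and low (column 3) temperatures from a saved CSV."""
    highs, lows = [], []
    with open(path, newline='', encoding=encoding) as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        for row in reader:
            if len(row) < 4:
                continue  # skip incomplete rows
            try:
                high = int(row[2].replace('℃', ''))
                low = int(row[3].replace('℃', ''))
            except ValueError:
                continue  # skip rows with non-numeric temperatures
            highs.append(high)
            lows.append(low)
    return highs, lows


# Demo with made-up rows written to a temporary file
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
with open(demo_path, 'w', newline='', encoding='utf-8') as f:
    demo_writer = csv.writer(f)
    demo_writer.writerow(['日期', '星期', '最高', '最低', '天气', '风向', '风力'])
    demo_writer.writerow(['2021-01-01', '星期五', '6℃', '-5℃', '晴', '西北风', '2级'])
    demo_writer.writerow(['2021-01-02', '星期六', '7℃', '-3℃', '多云', '东风', '1级'])

highs, lows = load_temps(demo_path, encoding='utf-8')
print(highs, lows)
```

The two returned lists can be passed to `plt.plot` in place of the lists built while scraping.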
Done!