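I. Preface
I've been quarantined at home lately, and the first thing I do every morning is check the situation across the country. Fortunately, things have started to improve a little. God bless Wuhan! God bless China!
I happened to come across a data endpoint in a tech post, opened it, and it felt like finding treasure. Time to crawl it and analyze it!
The data comes back as JSON, and each province is queried by its name. (For various reasons I can't post the data itself on the blog, and the data is masked in the visualizations as well. The method is what matters most anyway, so please bear with me!)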
url = "https://lab.isaaclin.cn/nCoV/api/area?latest=0&province={0}".format(province)
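To get nationwide data, just loop over the provinces, request each one, and parse the response. I truncated the per-province data at January 24 (Chinese New Year's Eve).
Now let's start sending requests.
II. The crawler
1. Request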
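The province loop described above can be sketched as follows. This is a minimal illustration, not the author's exact code: the helper name `build_urls` and the two sample province names are assumptions, and `urlencode` is used here only to make the percent-encoding of Chinese province names explicit (`requests` would also encode them on its own).

```python
from urllib.parse import urlencode

# Endpoint taken from the url line above
BASE_URL = "https://lab.isaaclin.cn/nCoV/api/area"


def build_urls(provinces):
    """Build one request URL per province name (hypothetical helper)."""
    urls = []
    for province in provinces:
        # latest=0 is the same query parameter used in the original url
        query = urlencode({"latest": 0, "province": province})
        urls.append("{0}?{1}".format(BASE_URL, query))
    return urls


# Sample names only; the real list would cover every province
sample_urls = build_urls(["湖北省", "广东省"])
```

Each URL in `sample_urls` can then be passed to the request function one at a time.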
import requests

def spider_virus(url):
    # A browser-like User-Agent header is enough for the request to succeed
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
    }
    html = requests.get(url=url, headers=headers)
    html.encoding = "utf-8"
    html.raise_for_status()  # raise on HTTP error status codes
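Setting a UA is all it takes for the request to go through~
Here's a UA package I recommend. It's handy and saves you from pasting a UA string every time.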
from fake_useragent import UserAgent
Just set "User-Agent": UserAgent().random,
and you're done. Very convenient.
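Put together, the header setup might look like the sketch below. The helper name `build_headers` and the fallback behaviour are assumptions added for illustration: if `fake_useragent` is not installed or cannot load its browser data, the fixed UA string from the request example is used instead.

```python
# Fixed UA string copied from the request example in this post
FALLBACK_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
)


def build_headers():
    """Return request headers, with a random User-Agent when possible."""
    try:
        # Third-party package: pip install fake-useragent
        from fake_useragent import UserAgent
        user_agent = UserAgent().random
    except Exception:
        # Package missing, or it could not load its browser data
        user_agent = FALLBACK_UA
    return {"User-Agent": user_agent}
```

The result can be passed straight to `requests.get(url, headers=build_headers())`.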
2. JSON parsing
Parsing the JSON is straightforward:
import time

# Province level
provinceName = results['provinceName']
provinceShortName = results['provinceShortName'] + "全部统计"  # suffix means "overall total"
confirmedCount = results['confirmedCount']
suspectedCount = results['suspectedCount']
curedCount = results['curedCount']
deadCount = results['deadCount']
# locationId = results['locationId']
# Convert the 13-digit (millisecond) timestamp to a readable local time
updateTimenum = results['updateTime']
timeStamp = int(updateTimenum) / 1000
timeArray = time.localtime(timeStamp)
updateTime = time.strftime("%Y-%m-%d %H:%M:%S", timeArray)
province_list.append((provinceName, provinceShortName, confirmedCount, suspectedCount,
                      curedCount, deadCount, updateTime))
# City level
try:
    cities = results['cities']
    for city_data in cities:
        # print(city_data)
        cityName = city_data['cityName']
        confirmedCount = city_data['confirmedCount']  # confirmed cases
        suspectedCount = city_data['suspectedCount']  # suspected cases
        curedCount = city_data['curedCount']  # recovered
        deadCount = city_data['deadCount']  # deaths
        # locationId = city_data['locationId']
        province_list.append((provinceName, cityName, confirmedCount, suspectedCount,
                              curedCount, deadCount, updateTime))
except Exception:
    # No usable city data: store a placeholder row ("未知" means "unknown")
    cities = "未知"
    cityName = "未知"
    confirmedCount = 0
    suspectedCount = 0
    curedCount = 0
    deadCount = 0
    province_list.append((provinceName, cityName, confirmedCount, suspectedCount,
                          curedCount, deadCount, updateTime))
To_MySQL("Virus", province_list)
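The same parsing can be wrapped in a small function that is easy to test on its own. This is only a sketch under a few assumptions: the names `ms_to_text` and `parse_province` are made up here, missing count fields default to 0 via `dict.get` instead of a blanket `try/except`, and the timestamp is formatted in UTC by default so the output is reproducible (pass `tz=None` for local time, which is what `time.localtime` gives).

```python
from datetime import datetime, timezone

COUNT_FIELDS = ("confirmedCount", "suspectedCount", "curedCount", "deadCount")


def ms_to_text(ms, tz=timezone.utc):
    """Convert a 13-digit millisecond timestamp to 'YYYY-MM-DD HH:MM:SS'."""
    moment = datetime.fromtimestamp(int(ms) / 1000, tz=tz)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def parse_province(results):
    """Turn one province record into rows shaped like the tuples above."""
    province = results["provinceName"]
    update_time = ms_to_text(results["updateTime"])
    # First row: province-wide totals ("全部统计" means "overall total")
    rows = [(province,
             results["provinceShortName"] + "全部统计",
             *(results.get(field, 0) for field in COUNT_FIELDS),
             update_time)]
    cities = results.get("cities")
    if cities is None:
        # No city list at all: keep one placeholder row ("未知" means "unknown")
        rows.append((province, "未知", 0, 0, 0, 0, update_time))
        return rows
    for city in cities:
        rows.append((province,
                     city.get("cityName", "未知"),
                     *(city.get(field, 0) for field in COUNT_FIELDS),
                     update_time))
    return rows
```

Returning the rows instead of appending to a shared list keeps the parsing step separate from the storage step, so the result can be handed to the database function afterwards.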
3. Data storage (MySQL)
In the end I stored the data in MySQL. The table creation is also done in Python, so I won't paste it here.
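For readers who want to reproduce the setup, a table matching the columns of the insert statement might be created like this. It is a guess at the schema, not the author's actual code: the function name `create_table`, the `id` column, and every column type are assumptions. The column name `updataTime` is spelled the way the insert statement spells it.

```python
def create_table(cursor, table_name):
    """Create the target table if it does not exist (assumed schema)."""
    # The table name is formatted into the SQL text, so restrict it to
    # identifier-like names before using it
    if not table_name.isidentifier():
        raise ValueError("unexpected table name: {0!r}".format(table_name))
    sql_create = """
        CREATE TABLE IF NOT EXISTS {0} (
            id INT AUTO_INCREMENT PRIMARY KEY,
            Province VARCHAR(32),
            city VARCHAR(64),
            confirmedCount INT,
            suspectedCount INT,
            curedCount INT,
            deadCount INT,
            updataTime DATETIME
        ) DEFAULT CHARSET=utf8;
    """.format(table_name)
    cursor.execute(sql_create)
    return sql_create
```

`cursor` is any DB-API cursor, for example the one returned by `db.cursor()` on a `pymysql` connection.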
import pymysql

def Insert_Data(datas_into, table_name):
    try:
        # print(datas_into)
        # Connection details are redacted
        db = pymysql.connect(host='<redacted>', user='root', password='*******', db='liu*<redacted>', charset='utf8')
        cursor = db.cursor()
        # for data_tups in list(datas_into):
        sql_insert = """insert into {0} (Province,city,confirmedCount,suspectedCount,curedCount,deadCount,updataTime) values (%s, %s, %s, %s, %s, %s, %s);""".format(table_name)
        cursor.executemany(s