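Approach:
1. Fetch the data
2. Analyze the data
3. Extract the fields we need
4. Display them as a table (prettified)
Three modules need to be installed.
Run the install command in a terminal: pip install <module name>
requests --> pip install requests
prettytable --> pip install prettytable
DrissionPage --> pip install DrissionPage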
I. Fetch the data
import requests
url='xxx'
headers = {
'Cookie':'uab_collina=171705490611934497367061; JSESSIONID=5E739BA6481B4A4FDF3E8F0FE11BAFE2; guidesStatus=off; highContrastMode=defaltMode; cursorStatus=off; _jc_save_fromStation=%u5317%u4EAC%2CBJP; _jc_save_toStation=%u4E0A%u6D77%2CSHH; _jc_save_wfdc_flag=dc; _jc_save_toDate=2024-06-01; BIGipServerotn=4023845130.50210.0000; BIGipServerpassport=971505930.50215.0000; route=9036359bb8a8a461c164a04f8f50b252; _jc_save_fromDate=2024-06-02',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36'
}
response=requests.get(url=url,headers=headers)
How to find the url and the other values:
II. Analyze the data and extract the needed fields
Step 1. See what the data contains
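# Locate the needed information (1): the unsplit records; the ['data']['result'] path was found by inspecting print(response.json())
print(response.json()['data']['result'])
# Locate the needed information (2): split one record and print each field next to its index
for index in response.json()['data']['result']:
    page = 0
    for i in index.split('|'):
        print(i, page, sep='|')
        page += 1
    break  # the first record is enough to see the layout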
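The manual counter in the loop above can also be written with enumerate. The following is a minimal sketch, not the original author's code: list_fields is a hypothetical helper name, and the sample record is made up and far shorter than a real 12306 record.

```python
def list_fields(record):
    """Return (index, value) pairs for one '|'-separated record."""
    return list(enumerate(record.split('|')))


# Made-up record for illustration only; real records contain many more fields.
sample = 'x|book|no|D9|BJP|SHH'

for idx, value in list_fields(sample):
    print(idx, value, sep='|')
```

Printing the index first makes the output easier to scan when looking for a particular field.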
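Step 2. Find the fields we need
E.g. finding the first-class seat field:
Pick a train that shows a concrete ticket count (easy to match up)
On the 12306 web page, the first-class seat count shown is 12
We need to find that value in the split data
Train D9 is known to be the third row from the end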
for index in response.json()['data']['result'][-3:]:
    page = 0
    for i in index.split('|'):
        print(i, page, sep='|')
        page += 1
    break  # [-3:] starts at the third-from-last record, so only D9 is printed
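The value 12 turns up at index 23, which gives the index of that seat class. Note that the code below reads first class from info[31] and soft sleeper from info[23], so check which column the 12 belongs to on the web page.
Representing all the needed information as a dictionary
The indices of all the other fields can be found the same way
Print the relevant information as a dictionary (as a check)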
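Instead of scanning the printed list by eye, the position of a known value can be looked up directly. This is a minimal sketch under stated assumptions: find_indices is a hypothetical helper, and the sample record is synthetic, built so that '12' sits at index 23 as in the text.

```python
def find_indices(record, value):
    """Return every index at which `value` appears in a '|'-separated record."""
    fields = record.split('|')
    return [i for i, field in enumerate(fields) if field == value]


# Synthetic record: 23 filler fields, then '12', then 3 more filler fields.
sample = '|'.join(['x'] * 23 + ['12'] + ['x'] * 3)

print(find_indices(sample, '12'))  # [23]
```

A value can appear in more than one field, so it helps to pick a train whose ticket count is distinctive; if several indices come back, compare against a second train.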
for index in response.json()['data']['result']:
    info = index.split('|')
    num = info[3]            # train number
    start_time = info[8]     # departure time
    end_time = info[9]       # arrival time
    use_time = info[10]      # travel time
    topGrade = info[32]      # special class
    first_class = info[31]   # first class
    second_class = info[30]  # second class
    hard_sleeper = info[28]  # hard sleeper
    hard_seat = info[29]     # hard seat
    no_seat = info[26]       # no seat (standing)
    soft_sleeper = info[23]  # soft sleeper
    dic = {
        'Train': num,
        'Departure': start_time,
        'Arrival': end_time,
        'Duration': use_time,
        'Special class': topGrade,
        'First class': first_class,
        'Second class': second_class,
        'Soft sleeper': soft_sleeper,
        'Hard sleeper': hard_sleeper,
        'Hard seat': hard_seat,
        'No seat': no_seat,
    }
    print(dic)
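The index-to-field mapping can be kept in one table so that the extraction step does not have to be repeated by hand. The sketch below is one possible arrangement, not the original code: FIELD_INDEX and parse_record are hypothetical names, the indices are the ones used in these notes, and the test record is synthetic (each field simply holds its own index).

```python
# Indices taken from the notes above; they reflect the response layout at the
# time of writing and are an assumption if the site changes its format.
FIELD_INDEX = {
    'train': 3,
    'departure': 8,
    'arrival': 9,
    'duration': 10,
    'special_class': 32,
    'first_class': 31,
    'second_class': 30,
    'soft_sleeper': 23,
    'hard_sleeper': 28,
    'hard_seat': 29,
    'no_seat': 26,
}


def parse_record(record):
    """Split one '|'-separated record and pick out the named fields.

    Fields beyond the end of a short record come back as empty strings.
    """
    info = record.split('|')
    return {
        name: (info[i] if i < len(info) else '')
        for name, i in FIELD_INDEX.items()
    }


# Synthetic record in which every field contains its own index.
sample = '|'.join(str(i) for i in range(40))
print(parse_record(sample))
```

With this in place, a change in the layout only requires editing FIELD_INDEX.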
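III. Tabular display (prettified)
import prettytable as pt
# Table mode
tb = pt.PrettyTable()  # create the table object
tb.field_names = [
    'No.',
    'Train',
    'Departure',
    'Arrival',
    'Duration',
    'Special class',
    'First class',
    'Second class',
    'Soft sleeper',
    'Hard sleeper',
    'Hard seat',
    'No seat',
]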
page = 1
for index in response.json()['data']['result']:
    info = index.split('|')
    num = info[3]            # train number
    start_time = info[8]     # departure time
    end_time = info[9]       # arrival time
    use_time = info[10]      # travel time
    topGrade = info[32]      # special class
    first_class = info[31]   # first class
    second_class = info[30]  # second class
    hard_sleeper = info[28]  # hard sleeper
    hard_seat = info[29]     # hard seat
    no_seat = info[26]       # no seat (standing)
    soft_sleeper = info[23]  # soft sleeper
    # The row order must match tb.field_names
    tb.add_row([
        page,
        num,
        start_time,
        end_time,
        use_time,
        topGrade,
        first_class,
        second_class,
        soft_sleeper,
        hard_sleeper,
        hard_seat,
        no_seat,
    ])
    page += 1
print(tb)
IV. Implementing the actual query
The url used so far was fixed: one date, from one place to another
Now the url is built from user input
import json
# city.json is expected to map city names to station codes
with open('city.json', encoding='utf-8') as f:
    txt = f.read()
# Parse the JSON string into a dict
json_data = json.loads(txt)
from_station = input('Enter the departure city: ')
to_station = input('Enter the destination city: ')
date = input('Enter the departure date (e.g. 2024-06-01): ')
print(json_data[from_station])
print(json_data[to_station])
url = f'https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.train_date={date}&leftTicketDTO.from_station={json_data[from_station]}&leftTicketDTO.to_station={json_data[to_station]}&purpose_codes=ADULT'
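The query url can also be assembled from a parameter dict, with a clearer error when a city is missing from the lookup table. This is a sketch under assumptions: build_query_url and lookup_station are hypothetical helpers, and the small stations dict stands in for the contents of city.json, using the BJP and SHH codes that appear in the cookie earlier in these notes.

```python
from urllib.parse import urlencode

BASE_URL = 'https://kyfw.12306.cn/otn/leftTicket/query'


def lookup_station(stations, city):
    """Return the station code for `city`, or raise ValueError if unknown."""
    try:
        return stations[city]
    except KeyError:
        raise ValueError(f'unknown city: {city!r}') from None


def build_query_url(date, from_code, to_code):
    """Build the ticket query url with the same parameters, in the same order."""
    params = {
        'leftTicketDTO.train_date': date,
        'leftTicketDTO.from_station': from_code,
        'leftTicketDTO.to_station': to_code,
        'purpose_codes': 'ADULT',
    }
    return f'{BASE_URL}?{urlencode(params)}'


# Stand-in for json_data loaded from city.json (assumed shape: name -> code).
stations = {'北京': 'BJP', '上海': 'SHH'}

url = build_query_url(
    '2024-06-01',
    lookup_station(stations, '北京'),
    lookup_station(stations, '上海'),
)
print(url)
```

urlencode also escapes any characters in the user input that are not safe in a query string, which the plain f-string does not do.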