Grabbing train tickets with a Python crawler: how to write a simple 12306 ticket-grabbing program in Python | Python train ticket scraping tutorial...

Latest recommended article published on 2024-07-14 17:20:08

weixin_39653320


Tags: Python crawler for grabbing train tickets

How to fetch CAPTCHA images in Python, such as the 12306 login CAPTCHA

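I have done this before. The biggest problem is that the recognition rate of CAPTCHA recognition algorithms is too low. A site like 12306 locks you out for 20 minutes after three failed logins, so unless your recognition rate is above 33%, it is not worth trying.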
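One way to read the 33% figure: with three attempts before the lockout, the chance that at least one attempt succeeds is 1 - (1 - p)^3 for a per-attempt recognition rate p. The sketch below works this out. It assumes the attempts are independent, which the answer does not state, and `chance_before_lockout` is a name made up for illustration.

```python
def chance_before_lockout(accuracy, attempts=3):
    """Probability that at least one of `attempts` logins succeeds.

    Assumes every attempt is independent (an illustrative assumption).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return 1.0 - (1.0 - accuracy) ** attempts


if __name__ == "__main__":
    for p in (0.10, 0.33, 0.50):
        print("accuracy %.0f%% -> %.1f%% chance of logging in before the lockout"
              % (p * 100, chance_before_lockout(p) * 100))
```

Under that assumption, a 10% recognizer gets through in roughly 27% of lockout windows, while a 33% recognizer gets through in roughly 70% of them.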

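The usual approach is to save the CAPTCHA images, typically collecting a few dozen, and then train your own recognition algorithm on them. I once wrote a recognizer with the PIL library, and its best recognition rate was only 10%. Speed was acceptable, at about 10 recognitions per second, mainly because the images are small and therefore quick to process.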
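The collection step can be sketched as below, using only the standard library. This is a minimal sketch, not the answerer's code: `collect_samples` and `fetch_image` are hypothetical names, and the `.jpg` extension is an assumption about the image format.

```python
import os


def collect_samples(fetch_image, out_dir, count=50):
    """Save `count` CAPTCHA images into `out_dir` and return their paths.

    `fetch_image` is any zero-argument callable that returns the raw bytes
    of one CAPTCHA image; how it talks to the site is left to the caller.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(count):
        data = fetch_image()
        # The .jpg extension is an assumption about the image format.
        path = os.path.join(out_dir, "captcha_%03d.jpg" % i)
        with open(path, "wb") as f:
            f.write(data)
        paths.append(path)
    return paths
```

Passing the downloader in as a callable keeps the sketch independent of any particular CAPTCHA URL, which the answer does not give. The saved files can then be labeled by hand and used as the training set.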

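There are a number of publicly available CAPTCHA recognition algorithms, but they are only useful as a reference. Real recognition requires training and tuning your own algorithm for the CAPTCHA you actually face.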

How to write a simple 12306 ticket-grabbing program in Python

# coding=utf-8
# Written against the Selenium 4 locator API: find_element(By.ID, ...).

import traceback
from time import sleep

from selenium import webdriver
from selenium.webdriver.common.by import By

TICKET_URI = 'https://kyfw.12306.cn/otn/leftTicket/init'
LOGIN_URI = 'https://kyfw.12306.cn/otn/login/init'
MY_URI = 'https://kyfw.12306.cn/otn/index/initMy12306'

# The original uses LOGIN without defining it. 'login_user' is an ASSUMED
# id for the login link on the ticket page; adjust it to the real page.
LOGIN = 'login_user'


def login():
    # Open the login form and type in the credentials (placeholders).
    brw.find_element(By.ID, LOGIN).click()
    sleep(3)
    uname = '123456789qq.com'
    pwd = 'xxxyyyzzz'
    brw.find_element(By.ID, 'username').send_keys(uname)
    sleep(1)
    brw.find_element(By.ID, 'password').send_keys(pwd)
    sleep(1)
    # The CAPTCHA is solved by hand in the browser window; wait here until
    # the site redirects to the account page.
    while brw.current_url != MY_URI:
        sleep(1)


def addCookie(cklist):
    # Keep only the cookies that store the chosen stations and travel date.
    wanted = ('_jc_save_toStation', '_jc_save_toDate', '_jc_save_fromStation')
    return [d for d in cklist if d['name'] in wanted]


def book():
    global brw
    brw = webdriver.Chrome()
    brw.set_window_size(1366, 768)
    brw.get(TICKET_URI)
    sleep(3)

    # Log in while the login link is still shown on the page.
    while brw.find_elements(By.ID, LOGIN):
        login()
        if brw.current_url == MY_URI:
            break

    try:
        brw.get(TICKET_URI)
        sleep(2)

        # set src
        brw.find_element(By.ID, 'fromStationText').clear()
        brw.find_element(By.ID, 'fromStationText').click()
        brw.find_element(By.ID, 'fromStationText').send_keys(u'合肥南')
        sleep(3)

        # set dst (left empty in the original; fill in your destination)
        brw.find_element(By.ID, 'toStationText').clear()
        brw.find_element(By.ID, 'toStationText').click()
        brw.find_element(By.ID, 'toStationText').send_keys(u'')
        sleep(3)

        # set departure date: pick it by hand in the browser window
        print('please click train date')
        sleep(5)

        # Re-add the station/date cookies so the choices survive the refresh.
        cke = brw.get_cookies()
        li = addCookie(cke)
        for x in li:
            brw.add_cookie(x)
        brw.refresh()

        count = 0
        success = False
        # Keep querying until train D3057 shows up in the results.
        while not success and brw.current_url == TICKET_URI:
            brw.find_element(By.ID, 'query_ticket').click()
            sleep(2)
            print('refresh #%d' % count)
            count += 1
            # The original stops at locating the train and does not go on
            # to submit an order.
            success = bool(brw.find_elements(By.PARTIAL_LINK_TEXT, 'D3057'))
    except Exception:
        traceback.print_exc()


if __name__ == "__main__":
    book()
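The script above relies on fixed `sleep` calls and on a login wait with no upper bound. One possible refinement, sketched here under the assumption that a timeout is wanted, is a small polling helper. `wait_until` is a hypothetical name and is not part of Selenium.

```python
import time


def wait_until(condition, timeout=60.0, interval=1.0):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the script, the login wait could then be written as `wait_until(lambda: brw.current_url == MY_URI, timeout=120)`, so a login that never completes gives up after two minutes instead of hanging forever.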

I wrote a Python script that reads the 12306 web page. It runs fine locally, but fails as soon as I deploy it to GAE

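In Python, I normally use urlopen from urllib to open a URL and fetch page content or data from a server.

On GAE you cannot do this; it fails with an "access denied" error. The main reason is that urlopen connects through sockets, and GAE, for security and efficiency reasons, prohibits urlopen and provides urlfetch in its place, which is based on HTTP connections.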

You can adapt your code along the lines of the following:

from google.appengine.api import urlfetch

# ... ...

url = "http://www.python.org"
result = urlfetch.fetch(url)
if result.status_code == 200:
    doc = result.content
    do_something(doc)  # placeholder for your own processing
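To keep one script working both locally and on GAE, the choice between the two can be wrapped in a single function. This is a minimal sketch for Python 3, assuming urlfetch is importable only where the App Engine SDK is present. `fetch_page` and its `fetcher` and `opener` parameters are hypothetical names added for illustration.

```python
try:
    # Available only where the App Engine SDK is installed.
    from google.appengine.api import urlfetch
except ImportError:
    urlfetch = None

from urllib.request import urlopen


def fetch_page(url, fetcher=urlfetch, opener=urlopen):
    """Return the response body for `url`, or None on a non-200 status."""
    if fetcher is not None:
        # On GAE: go through urlfetch.
        result = fetcher.fetch(url)
        return result.content if result.status_code == 200 else None
    # Elsewhere: fall back to the standard library.
    with opener(url) as resp:
        return resp.read() if resp.getcode() == 200 else None
```

Calling `fetch_page("http://www.python.org")` then uses urlfetch on GAE and urlopen elsewhere. The two keyword parameters exist mainly so the function can be exercised with stand-ins, without network access.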


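How to write a simple 12306 ticket-grabbing program in Python

Just use the 流燕 ticket-grabbing software.

How to write a simple 12306 ticket-grabbing program in Python

What is 12306?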
