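I'm a beginner at web scraping, and captcha recognition is an area where I feel especially out of my depth. I've recently been working through Cui Qingcai's book 《Python3网络爬虫开发实战》 (Python 3 Web Crawler Development in Practice). Because my operating system differs from the author's and the tools have moved on since the book was written, just adapting the book's code takes a lot of time, and I've hit plenty of pitfalls along the way. This post records what I learned. If it helps you, please give it a like! O(∩_∩)O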
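The Meizu login page uses three kinds of verification:
Slider verification:
Click-on-character verification:
One-click verification:
To practice on the click-on-character captcha, trigger verification several times beforehand. There is no need to enter an account or password: click the verification button, refresh, and repeat. After a few rounds, almost every challenge is the click-on-character type.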
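This post focuses on the problems I ran into while cropping the captcha image out of a page screenshot.
The crop coordinates come out offset, as shown in the figure below:
The crop should have contained the captcha image shown on the page, but the result was: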
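The following explanation of why selenium's location gives offset coordinates is quoted from the article:
python+selenium+Chromedriver使用location定位元素坐标偏差 (coordinate offset when locating elements with location in python + selenium + Chromedriver)
The element is located with xpath and its coordinates are read from .location, yet cropping that region out of the page screenshot captures the wrong area.
The offset is caused by the display scaling factor set in Windows. location returns coordinates as if the display were at 100%, whereas the screenshot is rendered at the scaled size, so the crop coordinates have to be scaled by the same factor. Using the unscaled values is what produces the offset.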
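The mismatch described above can be made concrete with a small sketch. It assumes the screenshot is simply the page rendered at `scale` times its 100% size; the helper name `crop_box` and the sample numbers are hypothetical and do not come from the quoted article.

```python
def crop_box(location, size, scale=1.0):
    """Convert an element's location/size (100% coordinates) into a
    (left, top, right, bottom) box measured in screenshot pixels.

    `scale` is the display scaling factor, e.g. 2.0 for 200%.
    """
    left = location['x'] * scale
    top = location['y'] * scale
    right = left + size['width'] * scale
    bottom = top + size['height'] * scale
    return tuple(int(round(v)) for v in (left, top, right, bottom))


# Hypothetical element position and size, shaped like selenium's
# element.location and element.size dictionaries.
location = {'x': 100, 'y': 50}
size = {'width': 300, 'height': 200}

print(crop_box(location, size))       # box to crop at 100% scaling
print(crop_box(location, size, 2.0))  # box to crop at 200% scaling
```

With a scale of 1.0 the box equals the raw location and size; any other scale moves and enlarges it, which is why an unscaled crop lands on the wrong part of the screenshot.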
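I tried all three fixes from the article Win10下用selenium、Image.crop() 截图时、坐标不准确的问题 (inaccurate coordinates when taking screenshots with selenium and Image.crop() on Win10):
Three ways to fix the inaccurate crop coordinates:
1. Multiply every value passed to crop() by the scaling factor (2 on this machine, which is set to 200%). This also gives an accurate crop: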
top = location['y']*2
left = location['x']*2
bottom = top+size['height']*2
right = left+size['width']*2
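Cropped image:
However, a later step uploads this image to the 超级鹰 (Chaojiying) recognition service, which returns the coordinates of each character within the uploaded image. Those coordinates are measured in the 200% image, while selenium clicks using 100% coordinates, so the click target ends up out of bounds:
selenium.common.exceptions.MoveTargetOutOfBoundsException: Message: move target out of bounds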
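To illustrate why the returned coordinates overshoot, here is a minimal sketch that checks whether a point fits inside the element and divides it by the scaling factor first. This rescaling is an assumption on my part: I did not try it against the Meizu page, and the helper names and sample points are hypothetical.

```python
def to_element_offsets(points, scale):
    """Map (x, y) points measured in the scaled screenshot back to
    100% coordinates by dividing by the display scaling factor."""
    return [(round(x / scale), round(y / scale)) for x, y in points]


def inside(point, size):
    """Return True if the offset lies within an element of the given size."""
    x, y = point
    return 0 <= x < size['width'] and 0 <= y < size['height']


size = {'width': 300, 'height': 200}  # hypothetical element size at 100%
returned = [(420, 260), (150, 90)]    # hypothetical points in a 200% image

# Raw points may fall outside the element; rescaled points should fit.
print([inside(p, size) for p in returned])
print([inside(p, size) for p in to_element_offsets(returned, 2)])
```

If the rescaled offsets are passed to the click step instead of the raw ones, the move target should stay inside the element, but treat that as something to verify rather than a confirmed fix.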
2. Run JavaScript to zoom the page:
This machine's scaling factor is 200%, so the zoom value is 100/200 = 0.5.
browser.execute_script('document.body.style.zoom="0.5"')
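Without any further changes, the program now raises selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element is not clickable at point (x,y). This happens because the verification button is now covered by another element on the page.
To get around this, click the button through JavaScript executed by selenium:
button = browser.find_element(By.CLASS_NAME, 'geetest_radar_btn')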
browser.execute_script('arguments[0].click();', button)
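The screenshot now crops correctly:
Cropped image:
But after uploading the image to Chaojiying and clicking the characters at the returned coordinates, the same error as in method 1 occurs: selenium.common.exceptions.MoveTargetOutOfBoundsException: Message: move target out of bounds
3. Set the display scaling back to 100% (this solved the problem)
In the Windows display settings, set the scaling to 100%.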
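Since the rest of the code depends on 100% scaling, a guard at startup can catch a wrong setting early. The sketch below reads `window.devicePixelRatio` through `execute_script`; it assumes that this value reflects the Windows display scaling in Chrome, and the function names `display_scale` and `require_unscaled` are hypothetical.

```python
def display_scale(driver):
    """Return the page's device pixel ratio (1.0 corresponds to 100%)."""
    return float(driver.execute_script('return window.devicePixelRatio'))


def require_unscaled(driver, tolerance=0.01):
    """Raise if the browser is not rendering at 100%, because the crop
    and click coordinates in this post assume unscaled output."""
    scale = display_scale(driver)
    if abs(scale - 1.0) > tolerance:
        raise RuntimeError(f'display scaling is {scale:.0%}; set it to 100%')
    return scale
```

It could be called as `require_unscaled(browser)` right after `browser.get(url)`, so the script stops with a clear message instead of producing a misaligned crop.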
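Final successful result:
Source code:
The code only works when the display scaling is 100%. To make testing easier I did not wrap it in a class as the book does, so it is not very extensible; adapt it to your needs.
Open the Meizu login page and enter the phone number and password: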
browser = webdriver.Chrome()
url = 'https://login.flyme.cn/'
browser.maximize_window()  # Maximize the window; this avoids several other bugs
browser.get(url)
wait = WebDriverWait(browser, 20)
PHONENUMBER = 'xxxxxxxxxxx'
PASSWORD = 'xxxxxxxxx'
inputPhone = browser.find_element(By.ID, 'account')
inputPassword = browser.find_element(By.ID, 'password')
inputPhone.send_keys(PHONENUMBER)
inputPassword.send_keys(PASSWORD)
Click the verification button:
button = wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'geetest_radar_tip')))
button.click()
Crop the captcha image:
img = wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'geetest_item_img')))
time.sleep(2)  # Give the captcha image time to render before taking the screenshot
screenshot = browser.get_screenshot_as_png()
screenshot = Image.open(BytesIO(screenshot))
location = img.location
size = img.size
top, bottom, left, right = location['y']-40, location['y']+size['height'], location['x'], location['x']+size['width']
print('Captcha position:', top, bottom, left, right)
captcha1 = screenshot.crop((left, top, right, bottom))
captcha1.save('captcha1.png')
captcha1.show()
Log in to Chaojiying and upload the captcha image as bytes:
CHAOJIYING_USERNAME = 'xxxxxxxxxx'
CHAOJIYING_PASSWORD = 'xxxxxxxxx'
CHAOJIYING_SOFT_ID = 'xxxxx'
CHAOJIYING_KIND = 9004
bytes_arr = BytesIO()
captcha1.save(bytes_arr, format='png')
chaoJiYing = Chaojiying_Client(CHAOJIYING_USERNAME, CHAOJIYING_PASSWORD, CHAOJIYING_SOFT_ID)
result = chaoJiYing.PostPic(bytes_arr.getvalue(), CHAOJIYING_KIND)
Get the coordinates of the characters in the captcha and click them:
groups = result.get('pic_str').split('|')
locations = [[int(number) for number in group.split(',')] for group in groups]
for location in locations:
    print(location)
    ActionChains(browser).move_to_element_with_offset(img, location[0], location[1]).click().perform()
    time.sleep(1)
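The parsing step above can be tried on its own without a browser. This sketch assumes, based on the split calls in the code, that `pic_str` has the form `x1,y1|x2,y2|...`; the sample string and the function name `parse_points` are made up for illustration.

```python
def parse_points(pic_str):
    """Parse 'x1,y1|x2,y2|...' into a list of (x, y) integer tuples."""
    points = []
    for group in pic_str.split('|'):
        x, y = (int(number) for number in group.split(','))
        points.append((x, y))
    return points


# Hypothetical response value, in the format assumed above
print(parse_points('108,133|227,143'))
```

Unpacking into exactly two names makes a malformed group fail with a ValueError instead of silently producing a short list.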
Submit the captcha and log in to Meizu:
confirm = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, 'geetest_commit')))
confirm.click()
success = wait.until(EC.text_to_be_present_in_element((By.CLASS_NAME, 'geetest_success_radar_tip_content'), '验证成功'))
login = wait.until(EC.element_to_be_clickable((By.ID, 'login')))
login.click()
Final working code:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver import ActionChains
from chaojiying import Chaojiying_Client
from io import BytesIO
from PIL import Image
import time
browser = webdriver.Chrome()
url = 'https://login.flyme.cn/'
browser.maximize_window()  # Maximize the window
browser.get(url)
# browser.execute_script('document.body.style.zoom="0.5"')
wait = WebDriverWait(browser, 20)
PHONENUMBER = 'xxxxxxxx'
PASSWORD = 'xxxxxxx'
inputPhone = browser.find_element(By.ID, 'account')
inputPassword = browser.find_element(By.ID, 'password')
inputPhone.send_keys(PHONENUMBER)
inputPassword.send_keys(PASSWORD)
button = wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'geetest_radar_tip')))
button.click()
# button = browser.find_element_by_class_name('geetest_radar_btn') #wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'geetest_radar_btn')))
# print(button.location)
# browser.execute_script('arguments[0].click();', button)
img = wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'geetest_item_img')))
time.sleep(2)  # Give the captcha image time to render before taking the screenshot
screenshot = browser.get_screenshot_as_png()
screenshot = Image.open(BytesIO(screenshot))
location = img.location
size = img.size
top, bottom, left, right = location['y']-40, location['y']+size['height'], location['x'], location['x']+size['width']
# top = (location['y']-53)*2
# left = location['x']*2
# bottom = (top+size['height']-5)*2 # ! Trims the white strip below the captcha
# right = left+size['width']*2
print('Captcha position:', top, bottom, left, right)
captcha1 = screenshot.crop((left, top, right, bottom))
captcha1.save('captcha1.png')
captcha1.show()
CHAOJIYING_USERNAME = 'xxxxxxx'
CHAOJIYING_PASSWORD = 'xxxxxx'
CHAOJIYING_SOFT_ID = 909992
CHAOJIYING_KIND = 9004
bytes_arr = BytesIO()
captcha1.save(bytes_arr, format='png')
chaoJiYing = Chaojiying_Client(CHAOJIYING_USERNAME, CHAOJIYING_PASSWORD, CHAOJIYING_SOFT_ID)
result = chaoJiYing.PostPic(bytes_arr.getvalue(), CHAOJIYING_KIND)
print(result)
groups = result.get('pic_str').split('|')
locations = [[int(number) for number in group.split(',')] for group in groups]
for location in locations:
    print(location)
    ActionChains(browser).move_to_element_with_offset(img, location[0], location[1]).click().perform()
    time.sleep(1)
confirm = wait.until(EC.element_to_be_clickable((By.CLASS_NAME, 'geetest_commit')))
confirm.click()
success = wait.until(EC.text_to_be_present_in_element((By.CLASS_NAME, 'geetest_success_radar_tip_content'), '验证成功'))
login = wait.until(EC.element_to_be_clickable((By.ID, 'login')))
login.click()