相信做爬虫的很多小伙伴一定会遇到很多滑块验证码的问题。在爬取天眼查的时候是要求需要登陆的。天眼查的滑块验证码可不可以解决呢?答案是一定可以的。今天我们就来聊聊类似天眼查这种滑块验证码的解决方案。
解决滑块验证码的步骤有这么几步。第一:截图。通过各种技术手段截目标图如下:
代码实现片段:
button = driver.find_element_by_xpath('/html/body/div[10]/div[2]/div[2]/div[2]/div[2]')
ActionChains(driver).move_to_element(button).click_and_hold().perform()
image_node = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '/html/body/div[10]/div[2]/div[2]/div[1]/div[2]/div[1]/a[2]/div[1]')))
left = image_node.location["x"]
top = image_node.location["y"]
element_width = image_node.location["x"] + image_node.size['width']
element_height = image_node.location["y"] + image_node.size['height']
driver.save_screenshot("./tianyancha.png")
原理:拖动滑块使得缺口出现,定位图片元素,通过图片的位置和大小,截图。截图效果如下
第二步:计算出滑动距离
代码如下:
def return_distance():
"""
根据截取的验证码图片计算滑动距离
"""
print("调用return_distance 方法-----------------------")
img = Image.open("./yanzhengma.png")
weight, height = img.size
for iin range(65, weight):
for jin range(height):
img = img.convert('RGB')
str_strlist = img.load()
data = str_strlist[i, j]
a, b, c = data
if ain list(range(10))and bin list(range(60, 80)):
print("有返回值-------------")
return i -6
第三步:设计出滑动轨迹,网上有很多中滑动根轨迹这里我也是参考的别人的轨迹,效果还不错。
代码如下:
def get_track(self, distance):
"""
根据偏移量和手动操作模拟计算移动轨迹
:paramdistance: 偏移量
:return: 移动轨迹
"""
# 移动轨迹
tracks = []
# 当前位移
current =0
# 减速阈值
mid = distance *4 /5
# 时间间隔
# t = 0.2
t =0.2
# 初始速度
v =0
while current < distance:
if current < mid:
# a = random.uniform(3, 7)
a = random.uniform(2, 5)
else:
a = -(random.uniform(12.5, 13.5))
# a = -(random.uniform(12.5, 13.5))
v0 = v
v = v0 + a * t
x = v0 * t +1 /2 * a * t * t
current += x
if 0.6 < current - distance <1:
x = x -0.53
tracks.append(round(x, 2))
elif 1 < current - distance <1.5:
x = x -1.4
tracks.append(round(x, 2))
elif 1.5 < current - distance <3:
x = x -1.8
tracks.append(round(x, 2))
else:
tracks.append(round(x, 2))
return tracks
最后:获取滑柄,模拟滑动
def move_to_gap(self, slider, tracks):
"""
将滑块移动至偏移量处
:paramslider: 滑块
:paramtracks: 移动轨迹
:return:
"""
action = ActionChains(self.browser)
action.click_and_hold(slider).perform()
for xin tracks:
# time.sleep(random.randrange(20, 40) / 200)
action.move_by_offset(xoffset=x,yoffset=-1).perform()
action = ActionChains(self.browser)
time.sleep(2)
action.release().perform()
# print(self.browser.current_window_handle)