Recognizing Image CAPTCHAs with Python

Vane.Q

Article link: https://blog.csdn.net/M18856018695/article/details/101068044

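I. Environment setup

1. Download and install Python

Omitted here.

2. Configure the pip index

Open a cmd window, run set, and look up your USERPROFILE value. It is normally your user directory, for example C:\Users\admin.

In that directory, create a folder named pip and a new file pip.ini inside it. Open the file with Notepad and enter the following to configure the index: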

[global]
timeout = 6000
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
trusted-host = pypi.tuna.tsinghua.edu.cn
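
As a small sketch of where this file is expected to live, the snippet below builds the pip.ini path from a user profile directory. The helper name pip_ini_path is hypothetical, and it assumes the per-user layout described above (a pip folder directly under USERPROFILE).

```python
import os
from pathlib import PureWindowsPath


def pip_ini_path(user_profile):
    """Return the pip.ini location under the given user profile directory."""
    return PureWindowsPath(user_profile) / "pip" / "pip.ini"


if __name__ == "__main__":
    # USERPROFILE is only defined on Windows; fall back to the example value.
    profile = os.environ.get("USERPROFILE", r"C:\Users\admin")
    print(pip_ini_path(profile))
```

Running pip config list afterwards should show the index-url value if pip picked the file up.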

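3. Install Tesseract OCR (the engine that pytesseract calls)

Download: https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v4.0.0-beta.1.20180414.exe

After installing, add the Tesseract installation directory to the PATH environment variable.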
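
To confirm that the PATH change took effect, a quick check along these lines may help. It is only a sketch: the function name tesseract_on_path is hypothetical, and it assumes the executable is called tesseract (tesseract.exe on Windows), which is what the installer normally provides.

```python
import shutil


def tesseract_on_path(executable="tesseract"):
    """Return the full path of the executable if PATH contains it, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    location = tesseract_on_path()
    if location:
        print("tesseract found at", location)
    else:
        print("tesseract is not on PATH; check the environment variable")
```

Open a new cmd window before running it, since already-open windows keep the old PATH.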

4. Install the required dependencies

pip install requests
pip install pytesseract
pip install pillow
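
Once the three installs finish, a short check like the following can report anything that is still missing. This is a sketch only; missing_packages and the REQUIRED mapping are names made up for illustration. Note that the pillow package is imported as PIL.

```python
import importlib.util

# pip package name -> module name used in import statements
REQUIRED = {"requests": "requests", "pytesseract": "pytesseract", "pillow": "PIL"}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All dependencies are installed")
```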

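II. Getting CAPTCHA images

No CAPTCHA images to work with? What now?

Here are two ways to get some.

Method 1: download someone else's CAPTCHAs from the web

The code is below; just replace the url.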

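import requests
import os
import random

root = "D:/pic/"
# Create the output directory once, before the download loop
os.makedirs(root, exist_ok=True)

for i in range(5):
    # Replace the placeholder with your own CAPTCHA URL; a random number is appended to it
    url = 'your URL here: ' + str(random.random())
    path = root + str(random.random()) + '.jpg'
    try:
        if not os.path.exists(path):
            r = requests.get(url, timeout=10)
            r.raise_for_status()
            # The with statement closes the file automatically
            with open(path, "wb") as f:  # "wb" writes the file in binary mode
                f.write(r.content)
            print("Download complete")
        else:
            print("File already exists")
    except Exception as e:
        print("Download failed: " + str(e))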
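
One thing the download loop does not check is whether the response body is really an image; a server can answer with an HTML error page that then gets saved as .jpg. A minimal sketch of a sanity check based on the well-known leading bytes of JPEG, PNG and GIF files follows. The function name looks_like_image is hypothetical, and other formats are simply reported as unknown.

```python
def looks_like_image(data):
    """Guess the image type from leading magic bytes; return None if unknown."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    return None
```

In the loop above it could be applied to r.content before writing the file, skipping the download when it returns None.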

Method 2: generate the CAPTCHAs yourself. This may require a font file; a system font works too.

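#!/usr/bin/python
# -*- coding: utf-8 -*-
from PIL import Image, ImageDraw, ImageFont, ImageFilter
import os
import random

width = 100
height = 50
# Background color of the image
clo = (0, 0, 0)
font = ImageFont.truetype('D:\\python\\test\\arialuni.ttf', 36)
save_root = 'E:/file/'
# Make sure the output directory exists before saving
os.makedirs(save_root, exist_ok=True)
for i in range(300):
    image = Image.new('RGB', (width, height), clo)
    draw = ImageDraw.Draw(image)
    # Text to draw; it also becomes the file name, i.e. the expected answer
    str1 = str(round(random.random() * 10000))
    w = 4  # distance from the left edge
    h = 3  # distance from the top edge
    draw.text((w, h), str1, font=font)
    # filter() returns a new image, so the result must be assigned back
    image = image.filter(ImageFilter.BLUR)
    code_name = str1 + '.jpg'
    save_dir = save_root + code_name
    image.save(save_dir, 'jpeg')
    print("Saved image: {}".format(save_dir))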
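
Note that round(random.random() * 10000) does not always give four digits: small values such as 42 come out shorter, and values of 9999.5 or more round up to 10000. If fixed-length codes are wanted, one possible alternative is to pick each digit separately. This is a sketch; random_code is a hypothetical helper, not part of the original script.

```python
import random


def random_code(length=4, rng=random):
    """Return a string of exactly `length` decimal digits, leading zeros kept."""
    return "".join(rng.choice("0123456789") for _ in range(length))


if __name__ == "__main__":
    print(random_code())
```

Because the result is already a string, it can replace str1 in the script directly and leading zeros survive in the file name.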

III. Recognizing the images

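from PIL import Image
from PIL import ImageEnhance
import pytesseract
import os

dir_root = 'E:/file/'
count = 0  # number of images processed
i = 0      # number of images recognized correctly

# Only look at files directly inside dir_root
for name in os.listdir(dir_root):
    file_path = os.path.join(dir_root, name)
    if not os.path.isfile(file_path):
        continue
    im = Image.open(file_path)
    im = im.convert('L')
    # Increase brightness slightly
    enh_bri = ImageEnhance.Brightness(im)
    brightness = 1.1
    im = enh_bri.enhance(brightness)
    # Strip surrounding whitespace so the comparison below can succeed
    code = pytesseract.image_to_string(im).strip()
    expected = name.split(".")[0]
    print('[files]:' + expected + '  [code]:' + code)
    count = count + 1
    if expected == code:
        i = i + 1
# Avoid division by zero when the directory is empty
rate = round(i / count * 100, 2) if count else 0
print('Total: ' + str(count) + ', correct: ' + str(i) + '; accuracy ' + str(rate) + '%')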
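
OCR output often carries stray whitespace such as a trailing newline, or a space between digits, which makes an exact string comparison with the file name fail. The scoring step can be separated from the OCR call and made tolerant of that, roughly as sketched here. Both function names are hypothetical, and removing every whitespace character is an assumption that fits digit-only codes.

```python
def normalize(ocr_text):
    """Drop all whitespace (spaces, newlines, form feeds) from OCR output."""
    return "".join(ocr_text.split())


def accuracy(pairs):
    """pairs: iterable of (expected, ocr_text). Return (correct, total, percent)."""
    pairs = list(pairs)
    correct = sum(1 for expected, text in pairs if normalize(text) == expected)
    total = len(pairs)
    percent = round(correct / total * 100, 2) if total else 0.0
    return correct, total, percent
```

Collecting (file name without extension, OCR text) pairs in the loop and passing them to accuracy at the end gives the same three numbers the script prints.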

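If the script does not run (typically because the tesseract executable cannot be found), adjust the tesseract setting so that pytesseract knows where tesseract is installed.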
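
One common way to do this is to set pytesseract.pytesseract.tesseract_cmd to the full path of the executable. The sketch below wraps that in a hypothetical helper, configure_tesseract; it falls back to whatever shutil.which finds on PATH, and the actual install path on your machine is something you need to supply yourself.

```python
import shutil


def configure_tesseract(explicit_path=None):
    """Point pytesseract at the tesseract executable.

    Returns the path that was set, or None if pytesseract is not installed
    or no executable could be located.
    """
    try:
        import pytesseract
    except ImportError:
        return None
    cmd = explicit_path or shutil.which("tesseract")
    if cmd:
        pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Call it once before the first image_to_string call, passing the full path to tesseract.exe if it is not on PATH.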

The run produces the following output (the screenshot was not preserved):

IV. Improving the recognition rate

Feeling a bit lazy; to be continued.
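
Until that part is written, here is one commonly used preprocessing idea, given only as a sketch and not as the author's method: convert the image to pure black and white with a fixed threshold before passing it to OCR. The function names and the threshold value of 140 are assumptions chosen for illustration; the right threshold depends on the images.

```python
def threshold_table(threshold=140):
    """Build a 256-entry lookup table: below threshold -> 0, otherwise 255."""
    return [0 if value < threshold else 255 for value in range(256)]


def binarize(image, threshold=140):
    """Convert a Pillow image to grayscale, then map every pixel to 0 or 255."""
    return image.convert("L").point(threshold_table(threshold))
```

With Pillow and pytesseract installed, usage would look like pytesseract.image_to_string(binarize(Image.open(path))). Whether this actually raises the accuracy has to be measured on your own images.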
