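I. Introduction
Installing dependencies
requirements.txt (this list is fairly complete, so install only what you need; captcha.py below imports only tensorflow, Pillow, requests, numpy, and keras_preprocessing)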
tensorflow==2.9.1
Pillow==9.1.1
requests
numpy==1.23.2
opencv-python >=4.5.4, <4.6
torch==1.10.0
Install command
pip install -r requirements.txt
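1.1 Input requirements
- Put the training set and the validation set in the directories named in the config file
- All images in those directories must have the same size, matching image_height and image_width in the config
- Name each image as captcha_index.extension, for example abce_01.jpg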
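The naming rule above can be checked before training starts. The following is a minimal sketch, not part of captcha.py: the helper names `parse_label` and `find_invalid_names` are hypothetical, the accepted extensions are an assumption, and the label set and length are copied from the sample config.

```python
import re

# Hypothetical helpers, not part of captcha.py: check names against the
# "<captcha>_<index>.<extension>" rule before training starts.
LABELS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
FIXED_LENGTH = 4


def parse_label(file_name, labels=LABELS, fixed_length=FIXED_LENGTH):
    """Return the captcha text encoded in file_name, or None if the name is invalid."""
    # Assumption: only jpg/jpeg/png files are used as training images.
    match = re.fullmatch(r"([^_]+)_[^.]+\.(jpg|jpeg|png)", file_name, flags=re.IGNORECASE)
    if match is None:
        return None
    label = match.group(1)
    if len(label) != fixed_length or any(ch not in labels for ch in label):
        return None
    return label


def find_invalid_names(file_names):
    """List the file names that would break label extraction during training."""
    return [name for name in file_names if parse_label(name) is None]


print(parse_label("abce_01.jpg"))
print(find_invalid_names(["abce_01.jpg", "abc_02.jpg", "notes.txt"]))
```

Running `find_invalid_names(os.listdir(...))` on both image directories before training gives a quick list of files to rename or remove.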
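1.2 Config file
- Default file: captcha.json (the code in 2.4 defaults to fixed_length_captcha.json, so either rename the file or change the default path in the code)
- Field names are self-explanatory
1.3 Project structure
venv: the virtual environment; this differs from machine to machine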
1.4 Training
python captcha.py
1.5 Prediction
predictor = Predictor()
# Predict from a file on local disk
predictor.predict('xxx.jpg')
# Predict directly from raw image bytes
predictor.predict_single_image_content(b'PNGxxxxx')
# Predict from a remote image
predictor.predict_remote_image('http://xxxxxx/xx.jpg', save_image_to_file='remote.jpg')
1.6 Results
II. Project Files
captcha.json
{
  "image_height": 45,
  "image_width": 125,
  "fixed_length": 4,
  "batch_size": 128,
  "save_path": "model\\model.dat",
  "labels": "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
  "train_image_dir": "train_images",
  "validation_image_dir": "validation_images",
  "learning_rate": 0.0001,
  "dropout_rate": 0.25,
  "epochs": 100
}
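2.1 Config field reference
1. image_height and image_width
- Meaning: the height and width of the input images, in pixels.
2. fixed_length
- Meaning: the fixed captcha length, i.e. the number of characters in each captcha.
3. batch_size
- Meaning: the number of samples in each training batch.
4. save_path
- Meaning: the path where the model weights are saved.
5. labels
- Meaning: the set of all characters that can appear in a captcha.
6. train_image_dir and validation_image_dir
- Meaning:
  - train_image_dir: path of the folder holding the training images.
  - validation_image_dir: path of the folder holding the validation images.
7. learning_rate
- Meaning: the learning rate used for training.
8. dropout_rate
- Meaning: the dropout rate of the Dropout layers, used to reduce overfitting.
9. epochs
- Meaning: the total number of training epochs.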
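A config file can be sanity-checked before a long training run. The sketch below is an illustration under stated assumptions rather than part of captcha.py: `check_config` and `REQUIRED_KEYS` are hypothetical names, and the specific checks (no duplicate label characters, dropout in [0, 1)) are my own choices. It also shows how the size of the model's output vector follows from fixed_length and labels.

```python
import json

# Hypothetical list of keys, taken from the sample captcha.json.
REQUIRED_KEYS = (
    "image_height", "image_width", "fixed_length", "batch_size", "save_path",
    "labels", "train_image_dir", "validation_image_dir",
    "learning_rate", "dropout_rate", "epochs",
)


def check_config(config):
    """Raise ValueError on missing or implausible fields; return the output vector size."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("missing config keys: %s" % ", ".join(missing))
    if len(set(config["labels"])) != len(config["labels"]):
        raise ValueError("labels contains duplicate characters")
    if not 0.0 <= config["dropout_rate"] < 1.0:
        raise ValueError("dropout_rate must be in [0, 1)")
    # The model emits one block of len(labels) outputs per captcha character.
    return config["fixed_length"] * len(config["labels"])


# Raw string so that the JSON escape "\\" reaches the JSON parser unchanged.
sample = json.loads(r"""
{
  "image_height": 45, "image_width": 125, "fixed_length": 4, "batch_size": 128,
  "save_path": "model\\model.dat",
  "labels": "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ",
  "train_image_dir": "train_images", "validation_image_dir": "validation_images",
  "learning_rate": 0.0001, "dropout_rate": 0.25, "epochs": 100
}
""")
print(check_config(sample))  # 4 characters * 62 labels = 248
```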
2.2 Sample images in the train_image_dir folder
2.3 Sample images in the validation_image_dir folder
2.4 Training code: captcha.py
# -*- coding: utf-8 -*-
"""
Train a CNN model that recognizes fixed-length character captchas
"""
import json
import io
import os
import time

import keras_preprocessing.image
import numpy as np
import PIL.Image
import requests
import tensorflow as tf
from tensorflow import keras
def label_to_array(text, labels):
    """
    Convert a captcha string into a concatenated one-hot vector
    :param text: captcha text
    :param labels: the set of all characters a captcha may contain
    :return: numpy array
    """
    hots = np.zeros(shape=(len(labels) * len(text)))
    for i, char in enumerate(text):
        index = i * len(labels) + labels.index(char)
        hots[index] = 1
    return hots


def array_to_label(array, labels):
    """
    Convert an array of character indices back into a label string
    :param array: numpy array of indices into labels
    :param labels: the set of all characters a captcha may contain
    :return: label string
    """
    text = []
    for index in array:
        text.append(labels[index])
    return ''.join(text)
def load_image_data(image_dir_path, image_height, image_width, labels, target_label_length):
    """
    Load image data
    Labels are read from the file names, which should follow the label_xxxx.jpg (or .png) format
    RGB images are converted to grayscale
    :param image_dir_path: image directory path
    :param image_height: image height
    :param image_width: image width
    :param labels: all possible label characters
    :param target_label_length: fixed length of an image label
    :return: image_data, label_data
    """
    image_name_list = os.listdir(image_dir_path)
    image_data = np.zeros(shape=(len(image_name_list), image_height, image_width, 1))
    label_data = np.zeros(shape=(len(image_name_list), len(labels) * target_label_length))
    for index, image_name in enumerate(image_name_list):
        img = keras_preprocessing.image.utils.load_img(os.path.join(image_dir_path, image_name), color_mode='grayscale')
        x = keras_preprocessing.image.utils.img_to_array(img)
        y = label_to_array(image_name.split('_')[0], labels)
        if hasattr(img, 'close'):
            img.close()
        image_data[index] = x
        label_data[index] = y
    return image_data, label_data
class FixCaptchaLengthModel(object):
    """
    Fixed-length captcha model
    Attributes:
        image_height: image height
        image_width: image width
        learning_rate: learning rate
        dropout: dropout rate
        label_number: number of distinct characters that may appear
        fixed_length: fixed captcha length
    """

    def __init__(self, image_height, image_width, label_number, fixed_length,
                 learning_rate=0.0001, dropout=0.25):
        self.image_height = image_height
        self.image_width = image_width
        # Images are always converted to grayscale
        self.image_channel = 1
        self.learning_rate = learning_rate
        self.dropout = dropout
        self.label_number = label_number
        self.fixed_length = fixed_length
        self.kernel_size = (3, 3)
        self.pool_size = (2, 2)
        self.padding = 'valid'
        self.activation = 'relu'

    def model(self):
        """
        :return: keras.Sequential instance
        """
        model = keras.Sequential()
        # Input layer (named input_layer to avoid shadowing the built-in input)
        input_layer = keras.Input(shape=(self.image_height, self.image_width, self.image_channel), batch_size=None)
        model.add(input_layer)
        # Layer 1: convolution
        model.add(keras.layers.Convolution2D(filters=32, kernel_size=self.kernel_size, strides=1, padding=self.padding,
                                             activation=self.activation))
        model.add(keras.layers.MaxPooling2D(pool_size=self.pool_size, strides=self.pool_size))
        model.add(keras.layers.Dropout(rate=self.dropout))
        # Layer 2: convolution
        model.add(keras.layers.Convolution2D(filters=64, kernel_size=self.kernel_size, strides=1, padding=self.padding,
                                             activation=self.activation))
        model.add(keras.layers.MaxPooling2D(pool_size=self.pool_size, strides=self.pool_size))
        model.add(keras.layers.Dropout(rate=self.dropout))
        # Layer 3: convolution
        model.add(keras.layers.Convolution2D(filters=128, kernel_size=self.kernel_size, strides=1, padding=self.padding,
                                             activation=self.activation))
        model.add(keras.layers.MaxPooling2D(pool_size=self.pool_size, strides=self.pool_size))
        model.add(keras.layers.Dropout(rate=self.dropout))
        model.add(keras.layers.Flatten())
        # Layer 4: fully connected
        model.add(keras.layers.Dense(units=1024, activation=self.activation))
        model.add(keras.layers.Dropout(rate=self.dropout))
        # Layer 5: fully connected
        model.add(keras.layers.Dense(units=self.fixed_length * self.label_number, activation="sigmoid"))
        model.compile(optimizer=keras.optimizers.Adam(learning_rate=self.learning_rate), loss="binary_crossentropy",
                      metrics=["binary_accuracy"])
        return model

    # def load_from_disk(self, model_file_path):
    #     """
    #     Load a trained model from disk
    #     :param model_file_path: model file path
    #     :return: keras.Sequential
    #     """
    #     if not os.path.exists(model_file_path):
    #         raise Exception('%s do not exists' % model_file_path)
    #     model = self.model()
    #     model.load_weights(model_file_path)
    #     return model

    def load_from_disk(self, model_file_path):
        """
        Load a trained model from disk
        :param model_file_path: weights path as given in save_path (the '.index' suffix is appended here)
        :return: keras.Model
        """
        # Check that the checkpoint index file exists
        index_path = model_file_path + ".index"
        if not os.path.exists(index_path):
            raise FileNotFoundError(f"Model index file not found at path: {index_path}")
        print(f"Loading model weights from {model_file_path}")
        model = self.model()  # build the model structure
        # Load the weights
        model.load_weights(model_file_path)
        print("Model weights loaded successfully")
        return model
class CheckAccuracyCallback(keras.callbacks.Callback):
    """
    Check the whole-captcha accuracy after each training epoch
    """

    def __init__(self, train_x, train_y, validation_x, validation_y, label_number, fixed_label_length, batch_size=128):
        super(CheckAccuracyCallback, self).__init__()
        self.train_x = train_x
        self.train_y = train_y
        self.validation_x = validation_x
        self.validation_y = validation_y
        self.label_number = label_number
        self.fixed_label_length = fixed_label_length
        self.batch_size = batch_size

    # def _compare_accuracy(self, data_x, data_y):
    #     predict_y = self.model.predict_on_batch(data_x)
    #     predict_y = keras.backend.reshape(predict_y, [len(data_x), self.fixed_label_length, self.label_number])
    #     data_y = keras.backend.reshape(data_y, [len(data_y), self.fixed_label_length, self.label_number])
    #     equal_result = keras.backend.equal(keras.backend.argmax(predict_y, axis=2),
    #                                        keras.backend.argmax(data_y, axis=2))
    #     return keras.backend.mean(keras.backend.min(keras.backend.cast(equal_result, tf.float32), axis=1))

    def _compare_accuracy(self, data_x, data_y):
        # Predictions
        predict_y = self.model.predict(data_x)
        predict_y = tf.reshape(predict_y, [len(data_x), self.fixed_label_length, self.label_number])
        data_y = tf.reshape(data_y, [len(data_y), self.fixed_label_length, self.label_number])
        # Argmax of the predicted and the true labels
        predict_labels = tf.argmax(predict_y, axis=2)
        true_labels = tf.argmax(data_y, axis=2)
        # Compare predicted and true labels
        correct_predictions = tf.equal(predict_labels, true_labels)
        # A sample counts as correct only if every character is predicted correctly
        sample_accuracy = tf.reduce_all(correct_predictions, axis=1)
        # Mean accuracy over all samples
        accuracy = tf.reduce_mean(tf.cast(sample_accuracy, tf.float32))
        return accuracy

    # def on_epoch_end(self, epoch, logs=None):
    #     print('\nEpoch %s with logs: %s' % (epoch, logs))
    #     # Pick one batch and compute its accuracy
    #     batches = (len(self.train_x) + self.batch_size - 1) / self.batch_size
    #     target_batch = (epoch + 1) % batches
    #     batch_start = int((target_batch - 1) * self.batch_size)
    #     batch_x = self.train_x[batch_start: batch_start + self.batch_size]
    #     batch_y = self.train_y[batch_start: batch_start + self.batch_size]
    #     on_train_batch_acc = self._compare_accuracy(batch_x, batch_y)
    #     print('Epoch %s with image accuracy on train batch: %s' % (epoch, keras.backend.eval(on_train_batch_acc)))
    #     on_test_batch_acc = self._compare_accuracy(self.validation_x, self.validation_y)
    #     print('Epoch %s with image accuracy on validation: %s\n' % (epoch, keras.backend.eval(on_test_batch_acc)))

    def on_epoch_end(self, epoch, logs=None):
        print(f'\nEpoch {epoch} with logs: {logs}')
        # Accuracy on the first training batch
        batch_start = 0
        batch_x = self.train_x[batch_start: batch_start + self.batch_size]
        batch_y = self.train_y[batch_start: batch_start + self.batch_size]
        on_train_batch_acc = self._compare_accuracy(batch_x, batch_y)
        print(f'Epoch {epoch} with image accuracy on train batch: {on_train_batch_acc.numpy()}')
        # Accuracy on the validation set
        on_test_batch_acc = self._compare_accuracy(self.validation_x, self.validation_y)
        print(f'Epoch {epoch} with image accuracy on validation: {on_test_batch_acc.numpy()}\n')
class Config(object):
    def __init__(self, **kwargs):
        self.image_height = kwargs['image_height']
        self.image_width = kwargs['image_width']
        self.fixed_length = kwargs['fixed_length']
        self.train_batch_size = kwargs['batch_size']
        self.model_save_path = kwargs['save_path']
        self.labels = kwargs['labels']
        self.train_image_dir = kwargs['train_image_dir']
        self.validation_image_dir = kwargs['validation_image_dir']
        self.learning_rate = kwargs['learning_rate']
        self.dropout_rate = kwargs['dropout_rate']
        self.epochs = kwargs['epochs']

    @staticmethod
    def load_configs_from_json_file(file_path='fixed_length_captcha.json'):
        """
        :param file_path: file path
        :return: Config instance
        """
        with open(file_path, 'r') as fd:
            config_content = fd.read()
        return Config(**json.loads(config_content))
class Predictor(object):
    """
    Predictor
    """

    def __init__(self, config_file_path='fixed_length_captcha.json'):
        self.config = Config.load_configs_from_json_file(config_file_path)
        self.model = FixCaptchaLengthModel(self.config.image_height, self.config.image_width, len(self.config.labels),
                                           self.config.fixed_length, learning_rate=self.config.learning_rate,
                                           dropout=self.config.dropout_rate).load_from_disk(self.config.model_save_path)
        self.label_number = len(self.config.labels)

    def predict(self, image_file_path):
        """
        Predict a single image
        :param image_file_path: path of a single image file
        :return: predict text
        """
        with open(image_file_path, 'rb') as f:
            return self.predict_single_image_content(f.read())

    def predict_remote_image(self, remote_image_url, headers=None, timeout=30, save_image_to_file=None):
        """
        Predict a remote image
        :param remote_image_url: remote image URL
        :param headers: request headers
        :param timeout: timeout in seconds
        :param save_image_to_file: file path to save the downloaded image to, or None to skip saving
        :return: predict text
        """
        response = requests.get(remote_image_url, headers=headers, timeout=timeout, stream=True)
        # Fail early on HTTP errors instead of passing an error page to PIL
        response.raise_for_status()
        content = response.content
        if save_image_to_file is not None:
            with open(save_image_to_file, 'wb') as fd:
                fd.write(content)
        return self.predict_single_image_content(content)

    def predict_single_image_content(self, image_content):
        """
        Predict a single image
        :param image_content: byte content
        :return: predict text
        """
        p_image = PIL.Image.open(io.BytesIO(image_content))
        if p_image.mode not in ('L', 'I;16', 'I'):
            p_image = p_image.convert('L')
        # The image must already have the configured height and width; it is not resized here
        image_data = np.zeros(shape=(1, self.config.image_height, self.config.image_width, 1))
        image_data[0] = keras_preprocessing.image.img_to_array(p_image)
        if hasattr(p_image, 'close'):
            p_image.close()
        result = self.model.predict_on_batch(image_data)
        result = keras.backend.reshape(result, [1, self.config.fixed_length, self.label_number])
        result = keras.backend.argmax(result, axis=2)
        return array_to_label(keras.backend.eval(result)[0], self.config.labels)
def train():
    """
    Train the model
    """
    config = Config.load_configs_from_json_file()
    train_x, train_y = load_image_data(config.train_image_dir, config.image_height, config.image_width,
                                       config.labels, config.fixed_length)
    validation_x, validation_y = load_image_data(config.validation_image_dir, config.image_height, config.image_width,
                                                 config.labels, config.fixed_length)
    print('total train image number: %s' % len(train_x))
    print('total validation image number: %s' % len(validation_x))
    model = FixCaptchaLengthModel(config.image_height, config.image_width, len(config.labels), config.fixed_length,
                                  learning_rate=config.learning_rate, dropout=config.dropout_rate)
    # load_from_disk looks for save_path + '.index', so check for the same file here to resume training
    if os.path.exists(config.model_save_path + '.index'):
        model = model.load_from_disk(config.model_save_path)
    else:
        model = model.model()
    callbacks = [
        keras.callbacks.ModelCheckpoint(filepath=config.model_save_path, save_weights_only=True, save_best_only=True),
        CheckAccuracyCallback(train_x, train_y, validation_x, validation_y, len(config.labels), config.fixed_length,
                              batch_size=config.train_batch_size)
    ]
    model.fit(train_x, train_y, batch_size=config.train_batch_size, epochs=config.epochs,
              validation_data=(validation_x, validation_y), callbacks=callbacks)


if __name__ == '__main__':
    start_time = time.time()
    train()
    # predictor = Predictor()
    # # Predict from a file on local disk
    # image_path = r'C:\Users\Administrator\Desktop\get.jpg'
    # ret = predictor.predict(image_path)
    # print(ret)
    end_time = time.time()
    print('total time: %s' % (end_time - start_time))
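The listing reports two different accuracies. The `binary_accuracy` metric counts each of the fixed_length * len(labels) sigmoid outputs separately, and since almost all targets are 0 it stays high even when whole captchas are wrong. CheckAccuracyCallback instead counts a captcha as correct only when every character matches. The sketch below illustrates that stricter metric on plain strings, next to a per-character accuracy for comparison; the function names and sample data are hypothetical and use no TensorFlow.

```python
def whole_image_accuracy(predicted, expected):
    """Fraction of captchas whose characters are all predicted correctly."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not predicted:
        return 0.0
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(predicted)


def per_character_accuracy(predicted, expected):
    """Fraction of individual characters predicted correctly."""
    pairs = [(p, e) for pred, exp in zip(predicted, expected) for p, e in zip(pred, exp)]
    if not pairs:
        return 0.0
    return sum(1 for p, e in pairs if p == e) / len(pairs)


# Made-up predictions: two captchas each have one wrong character.
predicted = ["abce", "abcf", "1234", "12x4"]
expected = ["abce", "abce", "1234", "1234"]
print(whole_image_accuracy(predicted, expected))    # 0.5
print(per_character_accuracy(predicted, expected))  # 0.875
```

The gap between the two numbers is why the per-image figure printed by the callback is the one to watch when judging whether the model is usable.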
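2.5 Overview of the code's main functions and structure
What the code does
- Data preprocessing:
  - label_to_array: converts a captcha text label into a one-hot encoded vector.
  - array_to_label: converts an array of character indices (the argmax of a prediction) back into a text label.
  - load_image_data: loads images from the given directory, converts RGB images to grayscale, and extracts the labels from the file names.
- Model definition:
  - FixCaptchaLengthModel: defines a convolutional neural network (CNN) for recognizing fixed-length captchas.
  - The model stacks three convolution and pooling stages and two fully connected layers, and outputs the captcha prediction.
- Training:
  - train function: loads the training and validation data, initializes the model, monitors training with callbacks (such as CheckAccuracyCallback), and saves the best model.
- Prediction:
  - Predictor class: loads the trained model and predicts single images or remote images.
  - Supports loading images from a local file or a remote URL and outputs the predicted captcha text.
- Config management:
  - Config class: loads the training and prediction parameters from a JSON file, such as image size, label set, and learning rate.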
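The label encoding described above can be tried without NumPy or TensorFlow. This is a pure-Python sketch of the same idea, not the code from captcha.py: `encode` mirrors label_to_array, while `decode` (a hypothetical helper) combines the per-block argmax with the index lookup that array_to_label performs.

```python
LABELS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode(text, labels=LABELS):
    """Concatenate one one-hot block of len(labels) entries per character."""
    vector = [0] * (len(labels) * len(text))
    for position, char in enumerate(text):
        vector[position * len(labels) + labels.index(char)] = 1
    return vector


def decode(vector, labels=LABELS):
    """Take the argmax of each block and map it back to a character."""
    size = len(labels)
    chars = []
    for start in range(0, len(vector), size):
        block = vector[start:start + size]
        chars.append(labels[block.index(max(block))])
    return "".join(chars)


vector = encode("abce")
print(len(vector), sum(vector))  # 248 4
print(decode(vector))            # abce
```

Because decoding only takes the largest value in each block, it works the same way on the model's sigmoid outputs as on exact one-hot vectors.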
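Code structure
- Utility functions:
  - label_to_array and array_to_label convert between labels and vectors.
  - load_image_data loads and preprocesses the image data.
- Model class:
  - FixCaptchaLengthModel defines the CNN structure and the training parameters.
  - Its model method builds the model, and its load_from_disk method loads saved model weights.
- Callback class:
  - CheckAccuracyCallback: at the end of each epoch, computes and prints the whole-captcha accuracy on one training batch and on the validation set.
- Config class:
  - The Config class loads and manages the training and prediction parameters.
  - Supports loading the config from a JSON file.
- Prediction class:
  - The Predictor class loads the model and runs predictions.
  - Provides prediction for local and remote images.
- Main entry point:
  - The train function starts model training.
  - The example code also shows (commented out) how to use the Predictor class for prediction.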
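To see what the CNN structure does to the image size, the feature-map shapes can be worked out by hand. The sketch below assumes the settings from the listing (3x3 kernels, stride 1, 'valid' padding, 2x2 max pooling with stride 2) and the 45x125 input from the sample config; the function names are hypothetical.

```python
def conv_valid(size, kernel=3, stride=1):
    """Output length of a 'valid' convolution along one axis."""
    return (size - kernel) // stride + 1


def pool_valid(size, pool=2, stride=2):
    """Output length of a 'valid' max pooling along one axis."""
    return (size - pool) // stride + 1


def trace_shapes(height, width, filters=(32, 64, 128)):
    """Return (height, width, channels) after each convolution + pooling stage."""
    shapes = []
    for channels in filters:
        height, width = conv_valid(height), conv_valid(width)
        height, width = pool_valid(height), pool_valid(width)
        shapes.append((height, width, channels))
    return shapes


shapes = trace_shapes(45, 125)
print(shapes)  # [(21, 61, 32), (9, 29, 64), (3, 13, 128)]
h, w, c = shapes[-1]
print(h * w * c)  # 4992 values feed the 1024-unit dense layer
```

With much smaller input images the height would shrink to zero after three stages, so the input size and the number of stages need to be chosen together.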
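Notes
- Dependencies: the code depends on TensorFlow, Keras, Pillow, NumPy, requests, and keras_preprocessing; make sure they are installed correctly.
- Data format:
  - Image file names should follow the label_xxxx.jpg or label_xxxx.png format, where label is the captcha text.
  - Training and validation images should be stored in their respective configured directories.
- Model save path:
  - Model weights are saved to the path given in the config file.
- Prediction:
  - Prediction supports local and remote images; remote images are loaded by URL.
- Config file:
  - The config file (e.g. fixed_length_captcha.json) should contain the parameters needed for training and prediction, such as image size, label set, and learning rate.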
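One practical detail about the model save path: load_from_disk in the listing looks for save_path + ".index" rather than save_path itself, which matches weights saved in TensorFlow checkpoint format (my assumption for a save_path such as model.dat that does not end in .h5). A small sketch of the same existence check, with a hypothetical helper name and a temporary directory standing in for the real model folder:

```python
import os
import tempfile


def weights_checkpoint_exists(save_path):
    """True if an index file exists for this weights path (same check as load_from_disk)."""
    return os.path.exists(save_path + ".index")


with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "model.dat")
    before = weights_checkpoint_exists(prefix)
    # Simulate a saved checkpoint by creating an empty index file.
    open(prefix + ".index", "w").close()
    after = weights_checkpoint_exists(prefix)

print(before, after)  # False True
```

Checking for the bare save_path would report that no weights exist even after training, so the index file is the one to test for when deciding whether to resume.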
III. Calling the Model
import captcha
# Image path
image_path = r'C:\Users\Administrator\Desktop\get.jpg'
predictor = captcha.Predictor()
# Predict from a file on local disk
predictor.predict(image_path)
# # Predict directly from raw image bytes
# predictor.predict_single_image_content(b'PNGxxxxx')
# # Predict from a remote image
# predictor.predict_remote_image('http://xxxxxx/xx.jpg', save_image_to_file='remote.jpg')
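Since every file name carries its expected captcha text, the predictor can be scored over a whole folder. The following is a rough sketch under stated assumptions: `evaluate` and `label_from_path` are hypothetical helpers, and `fake_predict` is a stand-in so the example runs without a trained model. In real use, `captcha.Predictor().predict` would be passed instead.

```python
import os


def label_from_path(path):
    """The expected captcha text is the file-name part before the first underscore."""
    return os.path.basename(path).split("_")[0]


def evaluate(predict, image_paths):
    """Run predict(path) on every image and return (accuracy, list of misses)."""
    misses = []
    for path in image_paths:
        expected = label_from_path(path)
        actual = predict(path)
        if actual != expected:
            misses.append((path, expected, actual))
    total = len(image_paths)
    accuracy = (total - len(misses)) / total if total else 0.0
    return accuracy, misses


def fake_predict(path):
    """Stand-in for captcha.Predictor().predict; always answers 'abce'."""
    return "abce"


paths = ["validation_images/abce_01.jpg", "validation_images/x1y2_02.jpg"]
accuracy, misses = evaluate(fake_predict, paths)
print(accuracy)  # 0.5
print(misses)    # [('validation_images/x1y2_02.jpg', 'x1y2', 'abce')]
```

The list of misses is often more useful than the accuracy itself, because it shows which characters the model confuses.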
IV. Code Download
train_image_dir: add training images yourself; the more the better (20,000 or more is ideal)
validation_image_dir: add validation images as your situation requires
Link: https://pan.baidu.com/s/1oILiNgWrz14CRDrV1kNkww?pwd=jz2i Extraction code: jz2i