0 Introduction
🔥 Over the past two years, the requirements and difficulty of graduation projects and thesis defenses have kept rising. Traditional project topics offer little novelty and few highlights, and often fall short of what a defense committee expects; again and again, juniors have told me that the systems they built did not meet their advisors' requirements.
To help everyone pass the graduation project smoothly and with the least effort possible, I share high-quality project ideas here. Today's share is:
🚩 Graduation Project: License Plate Recognition Based on Deep Learning (source code included)
🥇 My overall rating for this topic (each item scored out of 5):
Difficulty: 3
Workload: 3
Novelty: 4
🧿 Project files: see the end of this article!
1 Project Demo
Video demo:
Graduation Project: Deep Learning License Plate Recognition System
2 How License Plate Recognition Works
License plate recognition builds on image segmentation and image recognition theory: an image containing a vehicle's plate is analyzed to determine where the plate sits in the image, and the text characters are then extracted and recognized.
A typical pipeline consists of image acquisition, image preprocessing, plate localization, character segmentation, character recognition, and result output. These stages feed into one another, and each must be both efficient and robust to interference for the overall system to reach a satisfactory level of quality.
License plate recognition systems come in two main flavors: static-image recognition and dynamic video-stream recognition. Static-image recognition is constrained by image quality, plate wear, and plate tilt. Video-stream recognition additionally demands fast processing and is bounded by processor performance; real-time recognition on mobile devices in particular requires extensive optimization.
Although the pipeline has six stages, the core algorithms live in three modules: plate localization, character segmentation, and character recognition.
2.1 Plate Localization
Plate localization finds the plate in a still image or video frame and crops it out of the image for the downstream modules to process; it is one of the main factors that determine overall system performance. Many localization methods exist today, but broadly they fall into two categories.
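Commonly, these two families are traditional image-processing approaches (which exploit color, edge, and texture cues) and learning-based detectors. To give a flavor of the first family, here is a minimal, hedged sketch of a color-based locator for blue Chinese plates; it illustrates the idea only and is not the project's actual detector:

import cv2
import numpy as np

def locate_blue_plate(bgr):
    # mask pixels that fall in an approximate HSV range for plate blue
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (100, 80, 80), (124, 255, 255))
    # close small gaps so the plate region becomes one connected blob
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((3, 19), np.uint8))
    # OpenCV >= 4 return signature: (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in sorted(contours, key=cv2.contourArea, reverse=True):
        x, y, w, h = cv2.boundingRect(cnt)
        if h > 0 and 2.0 < w / h < 5.5:  # keep only plate-like aspect ratios
            return bgr[y:y + h, x:x + w]
    return None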
2.2 Character Segmentation
Character segmentation cuts each character out of a multi-character image so that every character becomes a single-character image. Traditional segmentation algorithms fall into two categories: direct segmentation and morphology-based segmentation. Direct segmentation is simple: it leans on prior knowledge such as the character layout of a plate, aided by basic projection algorithms. Morphology-based methods use edge detection, dilation, erosion, and similar operations to locate the character regions. Both are sensitive to external interference such as plate tilt and smudged or touching characters. Correct segmentation is critical to recognition, since the recognizer can only be accurate when the characters are split correctly. Meanwhile, as neural network theory advances, end-to-end image classification has made large strides, so many OCR systems are moving away from explicit character segmentation and letting the recognition network read multiple characters directly.
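For intuition, here is a minimal sketch of the projection idea behind the direct segmentation methods just described. It assumes a binarized plate image with white characters on a black background; this is illustrative, not the project's exact routine:

import numpy as np

def split_by_projection(plate_bin, threshold=0):
    # vertical projection: count of white pixels per column
    col_sum = (plate_bin > 0).sum(axis=0)
    segments, in_char, start = [], False, 0
    for x, v in enumerate(col_sum):
        if v > threshold and not in_char:   # entering a character
            in_char, start = True, x
        elif v <= threshold and in_char:    # leaving a character
            in_char = False
            segments.append((start, x))
    if in_char:
        segments.append((start, plate_bin.shape[1]))
    return [plate_bin[:, a:b] for a, b in segments]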
2.3 Character Recognition
Character recognition extracts character codes from an image containing one or more characters. The classic approach is machine-learning-based image classification, where one image yields exactly one class; in other words, the image may contain only a single character, which places high demands on segmentation accuracy. The alternative is end-to-end recognition based on recurrent neural networks: the whole plate image is fed to the network, which directly outputs all of the characters. The end-to-end route removes the segmentation step entirely, and with it the instability caused by segmentation errors, but it remains sensitive to other disturbances such as plate tilt.
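As a sketch of what such an end-to-end network can look like, here is a CRNN-style model that would be trained with CTC loss. It is purely illustrative (this project instead classifies characters one by one), and all sizes are assumptions:

from keras import layers, models

def build_crnn(num_classes, width=128, height=32):
    # CNN feature extractor over the whole plate image
    inp = layers.Input(shape=(height, width, 1))
    x = layers.Conv2D(32, 3, padding='same', activation='relu')(inp)
    x = layers.MaxPool2D(2)(x)
    x = layers.Conv2D(64, 3, padding='same', activation='relu')(x)
    x = layers.MaxPool2D(2)(x)
    # treat each image column as one time step for the recurrent layer
    x = layers.Permute((2, 1, 3))(x)
    x = layers.Reshape((width // 4, (height // 4) * 64))(x)
    x = layers.Bidirectional(layers.GRU(128, return_sequences=True))(x)
    # per-time-step character distribution; the extra class is the CTC blank
    out = layers.Dense(num_classes + 1, activation='softmax')(x)
    return models.Model(inp, out)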
3 License Plate Recognition with Machine Learning
I won't repeat the plate detection and character segmentation steps here; this section focuses on how to recognize plate characters with a support vector machine (SVM).
3.1 Support Vector Machines (SVM)
SVM is short for Support Vector Machine, a method with distinct advantages on small-sample, nonlinear, and high-dimensional pattern recognition problems.
Put simply, an SVM classifier finds an optimal hyperplane in the sample feature space: the hyperplane that maximizes the minimum distance from itself to the samples of every class.
As the figure below shows, there are two classes with five samples each; the optimal hyperplane is the one that separates the two classes while maximizing the distance between the plane and the closest samples of either class.
In short: an SVM is essentially a binary classifier, a hyperplane that separates samples of different classes in the sample space.
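To see the API in its simplest form, here is a toy two-class example with OpenCV's SVM, the same cv2.ml interface the project code below uses (the data points are made up for illustration):

import cv2
import numpy as np

# two linearly separable classes in a 2-D feature space
samples = np.float32([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
labels = np.int32([0, 0, 0, 1, 1, 1])

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)  # linear kernel: the decision surface is a hyperplane
svm.train(samples, cv2.ml.ROW_SAMPLE, labels)

# each query point is assigned to the side of the max-margin hyperplane it falls on
_, pred = svm.predict(np.float32([[2, 2], [7, 9]]))
print(pred.ravel())  # [0. 1.]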
3.2 Recognizing Characters with SVM
Definition:
class SVM(StatModel):
    def __init__(self, C=1, gamma=0.5):
        # an RBF-kernel C-SVC classifier from OpenCV's ml module
        self.model = cv2.ml.SVM_create()
        self.model.setGamma(gamma)
        self.model.setC(C)
        self.model.setKernel(cv2.ml.SVM_RBF)
        self.model.setType(cv2.ml.SVM_C_SVC)

    # train the SVM
    def train(self, samples, responses):
        self.model.train(samples, cv2.ml.ROW_SAMPLE, responses)
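The StatModel base class and the predict method are not part of the excerpt above. In the OpenCV sample style this code follows, they are thin wrappers, roughly like this (a hedged reconstruction, not the project's verbatim code):

class StatModel(object):
    def load(self, fn):
        self.model = cv2.ml.SVM_load(fn)  # reload a trained SVM from disk (OpenCV >= 3.4)

    def save(self, fn):
        self.model.save(fn)

# inside class SVM, a predict wrapper is also needed by the core code further below:
    def predict(self, samples):
        r = self.model.predict(samples)
        return r[1].ravel()  # one predicted label per row sample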
Calling it and feeding in the data:
def train_svm(self):
    # SVM for letters and digits
    self.model = SVM(C=1, gamma=0.5)
    # SVM for Chinese province characters
    self.modelchinese = SVM(C=1, gamma=0.5)
    if os.path.exists("svm.dat"):
        self.model.load("svm.dat")
    else:
        # otherwise train from scratch and save the model
        chars_train = []
        chars_label = []
        for root, dirs, files in os.walk("train\\chars2"):
            # single-character directory names serve as the class labels
            if len(os.path.basename(root)) > 1:
                continue
            root_int = ord(os.path.basename(root))
            for filename in files:
                filepath = os.path.join(root, filename)
                digit_img = cv2.imread(filepath)
                digit_img = cv2.cvtColor(digit_img, cv2.COLOR_BGR2GRAY)
                chars_train.append(digit_img)
                chars_label.append(root_int)
        chars_train = list(map(deskew, chars_train))
        chars_train = preprocess_hog(chars_train)
        chars_label = np.array(chars_label)
        print(chars_train.shape)
        self.model.train(chars_train, chars_label)
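The training code relies on two helpers the excerpt does not show, deskew and preprocess_hog. They appear to follow OpenCV's classic digits.py sample; a sketch along those lines (SZ is the character-image side length, assumed to be 20 as in that sample):

import cv2
import numpy as np

SZ = 20  # character image side length (assumption: as in OpenCV's digits.py)

def deskew(img):
    # straighten a character using its second-order image moments
    m = cv2.moments(img)
    if abs(m['mu02']) < 1e-2:
        return img.copy()
    skew = m['mu11'] / m['mu02']
    M = np.float32([[1, skew, -0.5 * SZ * skew], [0, 1, 0]])
    return cv2.warpAffine(img, M, (SZ, SZ), flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LINEAR)

def preprocess_hog(digits):
    # 16-bin gradient-orientation histogram over four quadrants per image
    samples = []
    for img in digits:
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
        mag, ang = cv2.cartToPolar(gx, gy)
        bin_n = 16
        bins = np.int32(bin_n * ang / (2 * np.pi))
        bin_cells = bins[:10, :10], bins[10:, :10], bins[:10, 10:], bins[10:, 10:]
        mag_cells = mag[:10, :10], mag[10:, :10], mag[:10, 10:], mag[10:, 10:]
        hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
        hist = np.hstack(hists)
        # Hellinger-kernel style normalization
        eps = 1e-7
        hist /= hist.sum() + eps
        hist = np.sqrt(hist)
        hist /= cv2.norm(hist) + eps
        samples.append(hist)
    return np.float32(samples)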
The plate character dataset is shown below.
These are the training data for the letters; there is a similar set for the province abbreviations used on our plates.
Core code:
predict_result = []
roi = None
card_color = None
for i, color in enumerate(colors):
    if color in ("blue", "yello", "green"):  # note: "yello" is the spelling this project uses for yellow plates
        card_img = card_imgs[i]
        gray_img = cv2.cvtColor(card_img, cv2.COLOR_BGR2GRAY)
        # characters on yellow/green plates are darker than the background,
        # the opposite of blue plates, so invert yellow/green plates first
        if color == "green" or color == "yello":
            gray_img = cv2.bitwise_not(gray_img)
        ret, gray_img = cv2.threshold(gray_img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # find peaks in the horizontal projection histogram
        x_histogram = np.sum(gray_img, axis=1)
        x_min = np.min(x_histogram)
        x_average = np.sum(x_histogram) / x_histogram.shape[0]
        x_threshold = (x_min + x_average) / 2
        wave_peaks = find_waves(x_threshold, x_histogram)
        if len(wave_peaks) == 0:
            print("peak less 0:")
            continue
        # take the widest horizontal peak as the plate area
        wave = max(wave_peaks, key=lambda x: x[1] - x[0])
        gray_img = gray_img[wave[0]:wave[1]]
        # find peaks in the vertical projection histogram
        row_num, col_num = gray_img.shape[:2]
        # trim one pixel off the top and bottom so white borders don't skew the threshold
        gray_img = gray_img[1:row_num - 1]
        y_histogram = np.sum(gray_img, axis=0)
        y_min = np.min(y_histogram)
        y_average = np.sum(y_histogram) / y_histogram.shape[0]
        y_threshold = (y_min + y_average) / 5  # keep the threshold low, otherwise U and 0 get split in half
        wave_peaks = find_waves(y_threshold, y_histogram)
        # a plate should contain more than six characters
        if len(wave_peaks) <= 6:
            print("peak less 1:", len(wave_peaks))
            continue
        wave = max(wave_peaks, key=lambda x: x[1] - x[0])
        max_wave_dis = wave[1] - wave[0]
        # check whether the first peak is the plate's left border
        if wave_peaks[0][1] - wave_peaks[0][0] < max_wave_dis / 3 and wave_peaks[0][0] == 0:
            wave_peaks.pop(0)
        # merge adjacent peaks that belong to one (visually split) Chinese character
        cur_dis = 0
        for i, wave in enumerate(wave_peaks):
            if wave[1] - wave[0] + cur_dis > max_wave_dis * 0.6:
                break
            else:
                cur_dis += wave[1] - wave[0]
        if i > 0:
            wave = (wave_peaks[0][0], wave_peaks[i][1])
            wave_peaks = wave_peaks[i + 1:]
            wave_peaks.insert(0, wave)
        # remove the separator dot on the plate
        point = wave_peaks[2]
        if point[1] - point[0] < max_wave_dis / 3:
            point_img = gray_img[:, point[0]:point[1]]
            if np.mean(point_img) < 255 / 5:
                wave_peaks.pop(2)
        if len(wave_peaks) <= 6:
            print("peak less 2:", len(wave_peaks))
            continue
        part_cards = seperate_card(gray_img, wave_peaks)
        for i, part_card in enumerate(part_cards):
            # this may be a rivet that fastens the plate
            if np.mean(part_card) < 255 / 5:
                print("a point")
                continue
            part_card_old = part_card
            w = abs(part_card.shape[1] - SZ) // 2
            part_card = cv2.copyMakeBorder(part_card, 0, 0, w, w, cv2.BORDER_CONSTANT, value=[0, 0, 0])
            part_card = cv2.resize(part_card, (SZ, SZ), interpolation=cv2.INTER_AREA)
            part_card = preprocess_hog([part_card])
            if i == 0:
                # the first character is a Chinese province abbreviation
                resp = self.modelchinese.predict(part_card)
                charactor = provinces[int(resp[0]) - PROVINCE_START]
            else:
                resp = self.model.predict(part_card)
                charactor = chr(resp[0])
            # check whether the last "character" is actually the plate's right border,
            # which tends to be recognized as a 1
            if charactor == "1" and i == len(part_cards) - 1:
                if part_card_old.shape[0] / part_card_old.shape[1] >= 7:  # too narrow for a real 1, treat as border
                    continue
            predict_result.append(charactor)
        roi = card_img
        card_color = color
        break
return predict_result, roi, card_color  # recognized characters, located plate image, plate color
Recognition results:
4 Character Recognition with Deep Learning
Recognition is the final stage of our automatic license plate detection and recognition system, and it operates on the single-character images produced by the earlier stages. Our model predicts each of these images in turn to assemble the final plate number.
To make the most of the training data, each character was cropped out individually, giving a plate-character dataset of 11 classes (the digits 0-9 plus one Arabic word), with 30 to 40 images per class, stored as 28×28 PNG files.
We then did some research comparing a multilayer perceptron (MLP) against a K-nearest-neighbors (KNN) classifier. The results showed that MLP performance improves as the hidden layer gains neurons, and KNN performance likewise improves with the number of neighbors. But because KNN offers far less tuning headroom than an MLP, we ultimately chose an MLP network to recognize the segmented plate characters.
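That kind of comparison is easy to reproduce in spirit with scikit-learn. A hedged sketch on the library's bundled 8×8 digits dataset, used here as a stand-in for the plate-character set (which is not distributed with this article):

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)  # stand-in for the character dataset

# vary the hidden-layer width for the MLP
for h in (32, 64, 128, 256):
    mlp = MLPClassifier(hidden_layer_sizes=(h,), max_iter=1000, random_state=0)
    print(f"MLP, {h} hidden units: {cross_val_score(mlp, X, y).mean():.3f}")

# vary the neighbor count for KNN
for k in (1, 3, 5, 7):
    knn = KNeighborsClassifier(n_neighbors=k)
    print(f"KNN, k={k}: {cross_val_score(knn, X, y).mean():.3f}")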
Network structure:
Key code:
# coding=utf-8
import cv2
import numpy as np
from keras import backend as K
from keras.layers import Activation, Conv2D, Dense, Dropout, Flatten, MaxPool2D
from keras.models import Sequential

# channels-last layout: input tensors are (rows, cols, channels)
K.set_image_data_format('channels_last')

index = {"京": 0, "沪": 1, "津": 2, "渝": 3, "冀": 4, "晋": 5, "蒙": 6, "辽": 7, "吉": 8, "黑": 9,
         "苏": 10, "浙": 11, "皖": 12, "闽": 13, "赣": 14, "鲁": 15, "豫": 16, "鄂": 17, "湘": 18, "粤": 19,
         "桂": 20, "琼": 21, "川": 22, "贵": 23, "云": 24, "藏": 25, "陕": 26, "甘": 27, "青": 28, "宁": 29,
         "新": 30, "0": 31, "1": 32, "2": 33, "3": 34, "4": 35, "5": 36, "6": 37, "7": 38, "8": 39,
         "9": 40, "A": 41, "B": 42, "C": 43, "D": 44, "E": 45, "F": 46, "G": 47, "H": 48, "J": 49,
         "K": 50, "L": 51, "M": 52, "N": 53, "P": 54, "Q": 55, "R": 56, "S": 57, "T": 58, "U": 59,
         "V": 60, "W": 61, "X": 62, "Y": 63, "Z": 64, "港": 65, "学": 66, "O": 67, "使": 68, "警": 69,
         "澳": 70, "挂": 71}

chars = ["京", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "皖", "闽", "赣", "鲁",
         "豫", "鄂", "湘", "粤", "桂", "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新",
         "0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
         "A", "B", "C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "P",
         "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z",
         "港", "学", "O", "使", "警", "澳", "挂"]

def Getmodel_tensorflow(nb_classes):
    # CNN for letters and digits
    img_rows, img_cols = 23, 23  # input character image size
    nb_pool = 2                  # max-pooling window size

    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(img_rows, img_cols, 1)))
    model.add(Activation('relu'))
    model.add(MaxPool2D(pool_size=(nb_pool, nb_pool)))
    model.add(Dropout(0.25))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPool2D(pool_size=(nb_pool, nb_pool)))
    model.add(Dropout(0.25))
    model.add(Conv2D(512, (3, 3)))
    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

def Getmodel_ch(nb_classes):
    # CNN for Chinese province characters; identical to the model above
    # except for a wider fully connected layer (756 instead of 512 units)
    img_rows, img_cols = 23, 23
    nb_pool = 2

    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(img_rows, img_cols, 1)))
    model.add(Activation('relu'))
    model.add(MaxPool2D(pool_size=(nb_pool, nb_pool)))
    model.add(Dropout(0.25))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPool2D(pool_size=(nb_pool, nb_pool)))
    model.add(Dropout(0.25))
    model.add(Conv2D(512, (3, 3)))
    model.add(Flatten())
    model.add(Dense(756))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

# build the networks and load the pre-trained weights
model = Getmodel_tensorflow(65)
model_ch = Getmodel_ch(31)
model_ch.load_weights("./model/char_chi_sim.h5")
model.load_weights("./model/char_rec.h5")

def SimplePredict(image, pos):
    image = cv2.resize(image, (23, 23))
    image = cv2.equalizeHist(image)
    image = image.astype(np.float32) / 255
    image -= image.mean()
    image = np.expand_dims(image, -1)  # add the channel axis: (23, 23) -> (23, 23, 1)
    if pos != 0:
        res = np.array(model.predict(np.array([image]))[0])
    else:
        res = np.array(model_ch.predict(np.array([image]))[0])
    zero_add = 0
    if pos == 0:
        # first plate position: Chinese province characters only
        res = res[:31]
    elif pos == 1:
        # second plate position: letters only, digits are skipped
        res = res[31 + 10:65]
        zero_add = 31 + 10
    else:
        # remaining positions: letters and digits
        res = res[31:]
        zero_add = 31
    max_id = res.argmax()
    return res.max(), chars[max_id + zero_add], max_id + zero_add
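In use, each segmented character crop is passed to SimplePredict together with its position index on the plate: pos 0 routes the crop to the Chinese-character model and keeps only the 31 province classes, pos 1 restricts the output to letters (the second position on a Chinese plate is never a digit), and all later positions allow both letters and digits.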
Recognition results:
Space here is limited; for the full design details, see the accompanying thesis.
5 Final Words
What the project includes:
🧿 Project files: see the end of this article!