Getting Started with TensorFlow: A Handwritten Digit Recognition App (1)

sunpro518

Category: Python. Tags: deep learning, TensorFlow, MNIST

Article link: https://blog.csdn.net/sunjinshengli/article/details/78681690

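I have been studying deep learning for a while now. I have gone through a range of algorithms (CNN, RNN, LSTM, R-CNN) and read a pile of papers. Leaving aside whether I actually understood them (feeling a bit guilty here), when I wanted to see how a trained model performs in practice, I realized I still did not know many of TensorFlow's basic features. So let's take the MNIST dataset and walk through the full handwritten digit recognition pipeline: capturing a digit drawn with the mouse, image preprocessing, model training, and digit prediction. The focus is on the implementation details of the key techniques in each step. This is a beginner's exercise, so corrections are very welcome.
This article follows the order in which the program is implemented and is divided into the following parts: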

Capturing a digit drawn with the mouse and saving it
Image preprocessing
Model training
Recognizing the input image with the model

The full code is on GitHub: sunpro/HandWritingRecognition-Tensorflow

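1. Capturing a digit drawn with the mouse and saving it

Input is captured with OpenCV's setMouseCallback() function, which delivers mouse events to a handler. The key is to implement the MouseCallback function. Its C++ declaration is: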

typedef void(* cv::MouseCallback) (int event, int x, int y, int flags, void *userdata)

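The parameters mean:

event: one of the cv::MouseEventTypes constants (an integer code for the mouse action).
x: the x-coordinate of the mouse event.
y: the y-coordinate of the mouse event.
flags: a value built from the cv::MouseEventFlags constants (the button and modifier-key state).
userdata: the optional parameter.

The difference between event and flags deserves attention.

cv::MouseEventTypes are integer codes for mouse actions. OpenCV defines 10 basic event types, numbered 0 to 9 (recent releases add two mouse-wheel events, 10 and 11). The first seven, in numeric order from 0, are: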

event	indication	description
EVENT_MOUSEMOVE	indicates that the mouse pointer has moved over the window.	mouse moved
EVENT_LBUTTONDOWN	indicates that the left mouse button is pressed.	left button pressed
EVENT_RBUTTONDOWN	indicates that the right mouse button is pressed.	right button pressed
EVENT_MBUTTONDOWN	indicates that the middle mouse button is pressed.	middle button pressed
EVENT_LBUTTONUP	indicates that left mouse button is released.	left button released
EVENT_RBUTTONUP	indicates that right mouse button is released.	right button released
EVENT_MBUTTONUP	indicates that middle mouse button is released.	middle button released
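
When debugging a callback it helps to print an event's name rather than its number. The following is a minimal sketch of such a lookup. The integer values are copied from OpenCV's cv2.EVENT_* constants so the snippet runs without OpenCV installed, and `event_name` is a hypothetical helper, not part of OpenCV.

```python
# Values mirror cv2.EVENT_* (copied here so the sketch runs without OpenCV).
EVENT_NAMES = {
    0: "EVENT_MOUSEMOVE",
    1: "EVENT_LBUTTONDOWN",
    2: "EVENT_RBUTTONDOWN",
    3: "EVENT_MBUTTONDOWN",
    4: "EVENT_LBUTTONUP",
    5: "EVENT_RBUTTONUP",
    6: "EVENT_MBUTTONUP",
}

def event_name(event):
    """Return a readable name for an event code (hypothetical helper)."""
    return EVENT_NAMES.get(event, "UNKNOWN(%d)" % event)

print(event_name(1))   # the code delivered when the left button goes down
print(event_name(4))   # the code delivered when the left button is released
```

Inside a real callback, the same function could be called on the `event` argument in place of a bare `print(str(event))`.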

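cv::MouseEventFlags describes which mouse buttons are being held (dragging) and which modifier keys are pressed. There are six flag bits, with values 1, 2, 4, 8, 16, and 32, and they can be combined with bitwise OR to describe joint keyboard and mouse states: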

flags	indication	description
EVENT_FLAG_LBUTTON	indicates that the left mouse button is down.	left button held (drag)
EVENT_FLAG_RBUTTON	indicates that the right mouse button is down.	right button held (drag)
EVENT_FLAG_MBUTTON	indicates that the middle mouse button is down.	middle button held (drag)
EVENT_FLAG_CTRLKEY	indicates that CTRL Key is pressed.	Ctrl held
EVENT_FLAG_SHIFTKEY	indicates that SHIFT Key is pressed.	Shift held
EVENT_FLAG_ALTKEY	indicates that ALT Key is pressed.	Alt held
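
Because each flag occupies its own bit, a single `flags` value can carry several states at once. Here is a minimal sketch of decoding it. The constants are redefined locally with OpenCV's values so the snippet runs without OpenCV installed, and `decode_flags` is a hypothetical helper name, not an OpenCV function.

```python
# Values mirror cv2.EVENT_FLAG_* (redefined here so the sketch runs without OpenCV).
EVENT_FLAG_LBUTTON = 1
EVENT_FLAG_RBUTTON = 2
EVENT_FLAG_MBUTTON = 4
EVENT_FLAG_CTRLKEY = 8
EVENT_FLAG_SHIFTKEY = 16
EVENT_FLAG_ALTKEY = 32

FLAG_NAMES = {
    EVENT_FLAG_LBUTTON: "EVENT_FLAG_LBUTTON",
    EVENT_FLAG_RBUTTON: "EVENT_FLAG_RBUTTON",
    EVENT_FLAG_MBUTTON: "EVENT_FLAG_MBUTTON",
    EVENT_FLAG_CTRLKEY: "EVENT_FLAG_CTRLKEY",
    EVENT_FLAG_SHIFTKEY: "EVENT_FLAG_SHIFTKEY",
    EVENT_FLAG_ALTKEY: "EVENT_FLAG_ALTKEY",
}

def decode_flags(flags):
    """Return the names of all flag bits set in `flags` (hypothetical helper)."""
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]

# Left button held while Ctrl is pressed: two bits set in one integer.
combined = EVENT_FLAG_LBUTTON | EVENT_FLAG_CTRLKEY
print(decode_flags(combined))

# An equality test misses the left button here; a bitwise AND finds it.
print(combined == EVENT_FLAG_LBUTTON)
print(bool(combined & EVENT_FLAG_LBUTTON))
```

This is why testing a flag with `&` tends to be more robust than testing it with `==`: the equality check only succeeds when no other bit is set.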

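Comparing the two sets of descriptions shows the difference: event refers to an instantaneous action, while flags refers to a state that persists. For example, when you click the left button, event marks the instant of the press. After that the left button stays held, and the state remains "left button down" until you release it (which in turn triggers the left-button-up event).
We capture input with the mouse, so what we want is the path the mouse traces while the left button is held down. The condition for capturing the stroke is therefore: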

if event == cv2.EVENT_MOUSEMOVE and (flag & cv2.EVENT_FLAG_LBUTTON):  # bitwise test: still true if Ctrl/Shift/Alt bits are also set
…
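
To see how the event/flags distinction turns into a stroke, here is a small simulation that needs no window. It feeds a hand-written sequence of events to a callback with the same parameter order as MouseCallback and records the segments that would be drawn. The constants copy OpenCV's values, and `on_mouse`, `state`, and the sample event sequence are illustrative assumptions. The real program draws with cv2.line instead of appending to a list.

```python
# Values mirror the corresponding cv2 constants (copied so no OpenCV is needed).
EVENT_MOUSEMOVE = 0
EVENT_LBUTTONDOWN = 1
EVENT_LBUTTONUP = 4
EVENT_FLAG_LBUTTON = 1

def on_mouse(event, x, y, flags, userdata):
    """Record one segment per mouse move while the left button is held."""
    if event == EVENT_LBUTTONDOWN:
        # Instantaneous action: remember where the stroke starts.
        userdata["current"] = (x, y)
    elif event == EVENT_MOUSEMOVE and flags & EVENT_FLAG_LBUTTON:
        # Persistent state: the button is still down, so extend the stroke.
        userdata["last"] = userdata["current"]
        userdata["current"] = (x, y)
        userdata["segments"].append((userdata["last"], userdata["current"]))

# State lives in the userdata argument instead of in globals (an assumption
# made for this sketch; the article's code uses global variables).
state = {"last": (0, 0), "current": (0, 0), "segments": []}

# Made-up event sequence: (event, x, y, flags)
events = [
    (EVENT_MOUSEMOVE, 5, 5, 0),                       # hovering, ignored
    (EVENT_LBUTTONDOWN, 10, 10, EVENT_FLAG_LBUTTON),  # stroke starts
    (EVENT_MOUSEMOVE, 12, 15, EVENT_FLAG_LBUTTON),    # first segment
    (EVENT_MOUSEMOVE, 20, 18, EVENT_FLAG_LBUTTON),    # second segment
    (EVENT_LBUTTONUP, 20, 18, 0),                     # stroke ends
    (EVENT_MOUSEMOVE, 30, 30, 0),                     # hovering again, ignored
]
for event, x, y, flags in events:
    on_mouse(event, x, y, flags, state)

print(state["segments"])
```

Only the two moves made while the button is held produce segments. The moves before the press and after the release are dropped, which is the behavior the condition above is meant to achieve.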

The complete implementation is as follows:

'''
./Input.py
Handle mouse events
to capture a handwritten digit.
'''
import cv2
import numpy as np

# Create a blank black canvas of shape (600, 600, 3); the dtype must be uint8
frame = np.zeros((600, 600, 3), np.uint8)

# Previous and current mouse positions as integer (x, y) pixel coordinates
last_measurement = current_measurement = (0, 0)

def OnMouseMove(event, x, y, flag, userdata):
    global current_measurement, last_measurement
    if event == cv2.EVENT_LBUTTONDOWN:
        # Start a new stroke at the point where the left button went down
        current_measurement = (x, y)

    # Bitwise test, so the check still passes when other flag bits
    # (for example Ctrl or Shift) are set at the same time
    if event == cv2.EVENT_MOUSEMOVE and (flag & cv2.EVENT_FLAG_LBUTTON):
        last_measurement = current_measurement  # current position becomes the previous one
        current_measurement = (x, y)            # new current position
        # Draw a white segment; cv2.line expects integer point coordinates
        cv2.line(frame, last_measurement, current_measurement, (255, 255, 255), thickness=8)

# Create the window
cv2.namedWindow("Input Number:")
# OpenCV handles mouse events through setMouseCallback; the first argument of
# the callback identifies the type of event that fired (click, move, etc.)
cv2.setMouseCallback("Input Number:", OnMouseMove)
key = 0
# Keep showing the canvas until 'q' is pressed
while key != ord('q'):
    cv2.imshow("Input Number:", frame)
    key = cv2.waitKey(1) & 0xFF
cv2.imwrite('number.jpg', frame)
print('number image has been stored and named "number.jpg"')
cv2.destroyAllWindows()