2_3 - NumPy CNN for MNIST Handwritten Digit Recognition

Part of the "Neural Networks in NumPy" series.

Project repository: https://github.com/yizt/numpy_neuron_network

Fundamentals

0_1 - Backpropagation for fully connected layers and loss functions

0_2_1 - Backpropagation for convolutional layers: single channel, no padding, stride 1

0_2_2 - Backpropagation for convolutional layers: multiple channels, no padding, stride 1

0_2_3 - Backpropagation for convolutional layers: multiple channels, no padding, stride other than 1

0_2_4 - Backpropagation for convolutional layers: multiple channels, with padding, stride other than 1

0_2_5 - Backpropagation for pooling layers: MaxPooling, AveragePooling, GlobalAveragePooling, GlobalMaxPooling

0_3 - Backpropagation for activation functions: ReLU, LeakyReLU, PReLU, ELU, SELU

0_4 - Optimization methods: SGD, AdaGrad, RMSProp, Adadelta, Adam

DNN exercises

1_1_1 - Linear regression with a fully connected network

1_1_2 - MNIST handwritten digit recognition with a fully connected network

CNN exercises

2_1 - Convolutional layer in NumPy

2_2 - Pooling layer in NumPy

2_3 - CNN for MNIST handwritten digit recognition in NumPy

Contents of this post:

1. Defining the Forward and Backward Passes
2. Loading the Data
3. Training the Network

1. Defining the Forward and Backward Passes

This post implements a CNN in pure NumPy and tests it on MNIST handwritten digit recognition.

If any part of the backpropagation process is still unclear, see the earlier posts on backpropagation for fully connected layers and loss functions, for convolutional layers, and for pooling layers.

The network structure, shown below, consists of one convolutional layer, one max-pooling layer, one flatten layer, and two fully connected layers:

input(1,28*28)=> conv(1,3,3) => relu => max pooling => flatten => fc(64) => relu => fc(10)

The convolutional layer has only one output channel and the hidden fully connected layer has only 64 neurons. This is mainly because a pure-NumPy network is slow, and the goal of this post is to understand backpropagation and how to implement a neural network in NumPy, not to design the best possible architecture. A NumPy implementation is not suitable for real projects; for those, use a deep learning framework (e.g. TensorFlow, Keras, Caffe).
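
As an aside, the 13 * 13 factor in the weight initialization below follows directly from this architecture: a 3x3 convolution with no padding and stride 1 maps the 28x28 input to 26x26, and 2x2 max pooling then halves it to 13x13. A minimal sketch of that calculation (illustrative only, not part of the repository):

# Sketch: feature-map sizes that determine the fc input dimension (filters * 13 * 13)
def conv_output_size(size, kernel, stride=1, padding=0):
    return (size + 2 * padding - kernel) // stride + 1

h = conv_output_size(28, 3)   # 26: 3x3 conv, no padding, stride 1
h = h // 2                    # 13: 2x2 max pooling with stride 2
print(h * h)                  # 169 = 13 * 13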

# Define weights, neurons (activations), and gradients
import numpy as np
weights = {}
weights_scale = 1e-2
filters = 1
fc_units=64
weights["K1"] = weights_scale * np.random.randn(1, filters, 3, 3).astype(np.float64)
weights["b1"] = np.zeros(filters).astype(np.float64)
weights["W2"] = weights_scale * np.random.randn(filters * 13 * 13, fc_units).astype(np.float64)
weights["b2"] = np.zeros(fc_units).astype(np.float64)
weights["W3"] = weights_scale * np.random.randn(fc_units, 10).astype(np.float64)
weights["b3"] = np.zeros(10).astype(np.float64)

# Initialize the neuron (activation) and gradient dicts
nuerons={}
gradients={}
# Import the forward and backward building blocks
from nn.layers import conv_backward,fc_forward,fc_backward
from nn.layers import flatten_forward,flatten_backward
from nn.activations import relu_forward,relu_backward
from nn.losses import cross_entropy_loss

import pyximport
pyximport.install()
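# conv_forward, max_pooling_forward and max_pooling_backward come from nn/clayers.pyx (Cython);
# pyximport compiles that module the first time it is imported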
from nn.clayers import conv_forward,max_pooling_forward,max_pooling_backward



# Forward pass
def forward(X):
    nuerons["conv1"]=conv_forward(X.astype(np.float64),weights["K1"],weights["b1"])
    nuerons["conv1_relu"]=relu_forward(nuerons["conv1"])
    nuerons["maxp1"]=max_pooling_forward(nuerons["conv1_relu"].astype(np.float64),pooling=(2,2))

    nuerons["flatten"]=flatten_forward(nuerons["maxp1"])

    nuerons["fc2"]=fc_forward(nuerons["flatten"],weights["W2"],weights["b2"])
    nuerons["fc2_relu"]=relu_forward(nuerons["fc2"])

    nuerons["y"]=fc_forward(nuerons["fc2_relu"],weights["W3"],weights["b3"])

    return nuerons["y"]

# Backward pass
def backward(X,y_true):
    loss,dy=cross_entropy_loss(nuerons["y"],y_true)
    gradients["W3"],gradients["b3"],gradients["fc2_relu"]=fc_backward(dy,weights["W3"],nuerons["fc2_relu"])
    gradients["fc2"]=relu_backward(gradients["fc2_relu"],nuerons["fc2"])

    gradients["W2"],gradients["b2"],gradients["flatten"]=fc_backward(gradients["fc2"],weights["W2"],nuerons["flatten"])

    gradients["maxp1"]=flatten_backward(gradients["flatten"],nuerons["maxp1"])

    gradients["conv1_relu"]=max_pooling_backward(gradients["maxp1"].astype(np.float64),nuerons["conv1_relu"].astype(np.float64),pooling=(2,2))
    gradients["conv1"]=relu_backward(gradients["conv1_relu"],nuerons["conv1"])
    gradients["K1"],gradients["b1"],_=conv_backward(gradients["conv1"],weights["K1"],X)
    return loss
# Compute accuracy
def get_accuracy(X,y_true):
    y_predict=forward(X)
    return np.mean(np.equal(np.argmax(y_predict,axis=-1),
                            np.argmax(y_true,axis=-1)))
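
Before training, the analytic gradients can be sanity-checked against numerical gradients on a tiny batch. The following finite-difference sketch is not part of the original code; it simply reuses the forward, backward and cross_entropy_loss defined above, and "b3" is just one example of a weight tensor to check:

# Sketch: numerical gradient of the loss w.r.t. one weight tensor (slow, use tiny batches)
def numeric_grad(X, y_true, key, eps=1e-5):
    w = weights[key]
    grad = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        old = w[idx]
        w[idx] = old + eps
        loss_plus, _ = cross_entropy_loss(forward(X), y_true)
        w[idx] = old - eps
        loss_minus, _ = cross_entropy_loss(forward(X), y_true)
        w[idx] = old
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

# Usage (once the data in the next section is loaded):
# X_chk, y_chk = next_batch(2)
# forward(X_chk); backward(X_chk, y_chk)
# print(np.max(np.abs(gradients["b3"] - numeric_grad(X_chk, y_chk, "b3"))))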

2. Loading the Data

Data source for mnist.pkl.gz: http://deeplearning.net/data/mnist/mnist.pkl.gz

from nn.load_mnist import load_mnist_datasets
from nn.utils import to_categorical
train_set, val_set, test_set = load_mnist_datasets('mnist.pkl.gz')
train_x,val_x,test_x=np.reshape(train_set[0],(-1,1,28,28)),np.reshape(val_set[0],(-1,1,28,28)),np.reshape(test_set[0],(-1,1,28,28))
train_y,val_y,test_y=to_categorical(train_set[1]),to_categorical(val_set[1]),to_categorical(test_set[1])
# Randomly sample training examples
train_num = train_x.shape[0]
def next_batch(batch_size):
    idx=np.random.choice(train_num,batch_size)
    return train_x[idx],train_y[idx]

x,y= next_batch(16)
print("x.shape:{},y.shape:{}".format(x.shape,y.shape))
x.shape:(16, 1, 28, 28),y.shape:(16, 10)
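
to_categorical from nn.utils converts the integer labels into one-hot vectors, which is where the (16, 10) label shape above comes from. A minimal sketch of such a function, assuming 10 classes (the repository's version may differ in details):

# Sketch: one-hot encode integer labels, e.g. [3, 1] -> array of shape (2, 10)
def to_categorical_sketch(labels, num_classes=10):
    labels = np.asarray(labels, dtype=np.int64)
    one_hot = np.zeros((labels.shape[0], num_classes))
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot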

3. Training the Network

Because the NumPy forward and backward passes of the convolutional layer are slow, we only run 2000 steps with a mini-batch size of 2; that is, only 4000 of the 50,000 training samples are actually used. Even so the accuracy is decent; more iterations would improve it further, and increasing the number of convolution output channels would raise the accuracy ceiling as well.
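
The training loop below updates the weights with SGD from nn.optimizers. As a rough reference only (an assumption about its behaviour, not the repository's exact code), a plain SGD step with per-iteration learning-rate decay might look like this:

# Sketch of an SGD update with learning-rate decay (hypothetical, for illustration)
class SGDSketch(object):
    def __init__(self, weights, lr=0.01, decay=1e-6):
        self.lr = lr
        self.decay = decay
        self.iterations = 0

    def iterate(self, weights, gradients):
        # decay the learning rate, then take a plain gradient step on every weight tensor
        lr = self.lr / (1.0 + self.decay * self.iterations)
        for key in weights:
            weights[key] -= lr * gradients[key]
        self.iterations += 1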


from nn.optimizers import SGD
# Initialize hyperparameters
batch_size=2
steps = 2000

# Optimizer that updates the weights from the gradients
sgd=SGD(weights,lr=0.01,decay=1e-6)

for s in range(steps):
    X,y=next_batch(batch_size)

    # Forward pass
    forward(X)
    # Backward pass
    loss=backward(X,y)


    sgd.iterate(weights,gradients)

    if s % 100 == 0:
        print("\n step:{} ; loss:{}".format(s,loss))
        idx=np.random.choice(len(val_x),200)
        print(" train_acc:{};  val_acc:{}".format(get_accuracy(X,y),get_accuracy(val_x[idx],val_y[idx])))

print("\n final result test_acc:{};  val_acc:{}".
      format(get_accuracy(test_x,test_y),get_accuracy(val_x,val_y)))
 step:0 ; loss:2.3025710785961633
 train_acc:0.5;  val_acc:0.105

 step:100 ; loss:2.322658576777174
 train_acc:0.0;  val_acc:0.135

 step:200 ; loss:2.2560641373902453
 train_acc:0.0;  val_acc:0.15

 step:300 ; loss:2.1825470524006914
 train_acc:1.0;  val_acc:0.105

 step:400 ; loss:2.208445091755495
 train_acc:0.0;  val_acc:0.12

 step:500 ; loss:1.413758817626698
 train_acc:0.5;  val_acc:0.475

 step:600 ; loss:0.8138671602711395
 train_acc:1.0;  val_acc:0.605

 step:700 ; loss:0.040969240382020794
 train_acc:1.0;  val_acc:0.695

 step:800 ; loss:0.2943919590130214
 train_acc:1.0;  val_acc:0.8

 step:900 ; loss:0.7937038773889639
 train_acc:0.5;  val_acc:0.775

 step:1000 ; loss:0.20416262923266468
 train_acc:1.0;  val_acc:0.82

 step:1100 ; loss:3.492562642433139
 train_acc:0.5;  val_acc:0.755

 step:1200 ; loss:0.44327566847604044
 train_acc:1.0;  val_acc:0.81

 step:1300 ; loss:0.381620659555296
 train_acc:1.0;  val_acc:0.78

 step:1400 ; loss:0.1379428630137357
 train_acc:1.0;  val_acc:0.715

 step:1500 ; loss:0.0048211652445979145
 train_acc:1.0;  val_acc:0.78

 step:1600 ; loss:0.6156347089073209
 train_acc:1.0;  val_acc:0.78

 step:1700 ; loss:2.9270997739154003
 train_acc:0.5;  val_acc:0.84

 step:1800 ; loss:0.7148056981166203
 train_acc:1.0;  val_acc:0.845

 step:1900 ; loss:3.3810034206400825
 train_acc:0.5;  val_acc:0.745

 final result test_acc:0.8279;  val_acc:0.839
# Randomly inspect a few predictions
import matplotlib.pyplot as plt

idx=np.random.choice(test_x.shape[0],3)
x,y=test_x[idx],test_y[idx]
y_predict = forward(x)
for i in range(3):
    plt.figure(figsize=(3,3))
    plt.imshow(np.reshape(x[i],(28,28)))
    plt.show()
    print("y_true:{},y_predict:{}".format(np.argmax(y[i]),np.argmax(y_predict[i])))

[figure: test image, handwritten digit 0]

y_true:0,y_predict:0

[figure: test image, handwritten digit 4]

y_true:4,y_predict:4

[figure: test image, handwritten digit 8]

y_true:8,y_predict:8
