[Getting Started with Machine Learning] Optimizing a Keras Handwritten-Digit Recognition Case: Accuracy from 10% to 99%


P.S. This post is a text-and-code record of the uploader's video on Bilibili.

I. Without Convolutional Layers

Setup: importing the packages

import numpy as np
import keras
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense,Activation,Dropout,Conv2D,MaxPooling2D,Flatten
from keras.optimizers import SGD,Adam

Data preparation; the purpose of each step is explained in the comments.

def load_data():
    (x_train,y_train),(x_test,y_test)=mnist.load_data()
    number=10000 # use only the first 10,000 training samples to keep training fast
    x_train=x_train[0:number]
    y_train=y_train[0:number]
    x_train=x_train.reshape(x_train.shape[0],28*28)
    x_test=x_test.reshape(x_test.shape[0],28*28)
    #one-hot encode the labels into 10-dimensional vectors
    y_train=np_utils.to_categorical(y_train,10)
    y_test=np_utils.to_categorical(y_test,10)
    #normalize pixel values to [0,1]
    x_train=x_train.astype('float32')
    x_test=x_test.astype('float32')
    x_train=x_train/255
    x_test=x_test/255
    return (x_train,y_train),(x_test,y_test)
(x_train,y_train),(x_test,y_test)=load_data()

1 sigmoid activation + mse + SGD

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='sigmoid'))
for i in range(10): # stack 10 more hidden layers (12 Dense layers in total)
    model.add(Dense(units=666,activation='sigmoid'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='mse',optimizer=SGD(lr=0.1),metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

This is the brute-force version: 12 fully connected layers. It performs terribly, essentially no better than 3 layers (see my video for the run). Accuracy is a pitiful 10%, the same as blind guessing; the model has learned nothing at all.
A quick introduction to sigmoid:
$$\theta(x) = \frac{1}{1 + e^{-x}}$$
Yes, that thing. Because it squashes every output into (0, 1), its derivative is small everywhere (at most 0.25), and backpropagation multiplies in one such factor per layer; the gradients reaching the early layers all but vanish, and the network fails to train. That, roughly, is the vanishing-gradient problem.
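A minimal numpy sketch (my own addition, not from the video) makes this concrete: the sigmoid derivative never exceeds 0.25, and a 12-layer stack multiplies twelve such factors together.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)), maximized at x = 0
x = np.linspace(-5, 5, 101)
d = sigmoid(x) * (1 - sigmoid(x))
print('max sigmoid derivative:', d.max())    # 0.25
# each of the 12 layers multiplies in a factor <= 0.25, so the
# gradient reaching the first layer is bounded by:
print('bound after 12 layers:', 0.25 ** 12)  # ~6e-8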
Next up is mse, the mean squared error:
$$J(\theta) = \frac{1}{2m}\sum_{i = 0}^{m}\left(y^{i} - h_\theta(x^{i})\right)^2$$
That's the one; it measures the squared gap between the predictions $h_\theta(x^i)$ and the labels $y^i$.
As for SGD, that's stochastic gradient descent: differentiate the mse above and repeatedly step the parameters along the negative gradient, closing in on a local optimum.
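A toy sketch of that idea (again my own; full-batch for simplicity, where "stochastic" just means estimating the gradient from mini-batches, like batch_size=100 above), fitting a one-parameter model y_hat = w*x:

import numpy as np

np.random.seed(0)
x = np.random.rand(100)
y = 3.0 * x                               # true weight is 3
w, lr = 0.0, 0.1
for _ in range(200):
    grad = -np.mean((y - w * x) * x)      # dJ/dw for J = mean((y - w*x)**2)/2
    w -= lr * grad                        # step against the gradient
print(w)                                  # converges to ~3.0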

2 relu activation + mse + SGD

The diagnosis of the failure above: because of sigmoid's squashing, gradients from the later layers do not survive back to the earlier ones. So here we switch the activation to relu.

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='mse',optimizer=SGD(lr=0.1),metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

All I changed here was the activation function, and the 3-layer fully connected network's training accuracy jumped to an impressive 90%.
What is relu, and why does it work so well?
$$a = \max(0, z)$$
Yes, that's it: immeasurably simpler than sigmoid, and yet immeasurably more effective. The reason is that its derivative is exactly 1 for positive inputs, so gradients pass backward through active units without shrinking.
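A quick numpy illustration (my own) of that claim:

import numpy as np

z = np.array([-2.0, -0.5, 0.3, 4.0])
relu = np.maximum(0, z)          # forward: max(0, z)
grad = (z > 0).astype(float)     # backward: 1 where z > 0, else 0
print(relu)   # [0.  0.  0.3 4. ]
print(grad)   # [0. 0. 1. 1.]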
So, can we keep optimizing without touching convolutional layers yet?

3 relu activation + mse + SGD, more layers

Hmm... let's brute-force a 12-layer fully connected stack and see.

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
for i in range(10):
    model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='mse',optimizer=SGD(lr=0.1),metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

And my result broke: only 26%. Clearly, more layers is not automatically better; pile on too many and the result degrades.
So is that the end of the road?

4 relu activation + mse + Adam, faster convergence

Here we introduce another optimizer, Adam, an adaptive method. Unlike SGD, it tracks running estimates of each parameter's gradient mean and variance during training and scales every update accordingly, which typically speeds up convergence.
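Before the Keras run, here is a minimal sketch of the standard Adam update rule (my own rendering with the usual default hyperparameters, not Keras's internal code):

import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    m = beta1 * m + (1 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction for startup
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w, m, v

w = np.zeros(3); m = np.zeros(3); v = np.zeros(3)
w, m, v = adam_step(w, np.array([0.1, -2.0, 0.01]), m, v, t=1)
print(w)   # large and small gradients get comparably sized steps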

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='mse',optimizer='adam',metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

The same three fully connected layers now reach 95% accuracy; a gratifying result.

5 relu activation + categorical_crossentropy + SGD

Back to the earlier question: can we rescue the 12-layer network that trained so badly? Here we swap mse for categorical_crossentropy, i.e. cross-entropy loss, and try again.

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
for i in range(10):
    model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',optimizer=SGD(lr=0.1),metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

Under this setup, accuracy reaches 95.1% and the model no longer collapses. So what is cross-entropy? (In the video I honestly had no idea why, only that it works better. The short answer: with a softmax output, cross-entropy's gradient at the output layer is simply $\hat{y} - y$, whereas mse's gradient carries an extra softmax-derivative factor that is nearly zero when the network is confidently wrong, which is exactly when it most needs to learn.)
$$L = -\left[\,y\log \hat{y} + (1-y)\log(1-\hat{y})\,\right]$$
That's the binary form; for our 10 classes, Keras uses the categorical version $L = -\sum_k y_k \log \hat{y}_k$. It looks complicated, but paired with softmax its derivative is remarkably clean, as the sketch below shows.
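A small numpy comparison (my own, not from the video) of the two losses' gradients at the softmax output when the network is confidently wrong:

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

y = np.array([1.0, 0.0, 0.0])          # true class is 0
z = np.array([-5.0, 5.0, 0.0])         # logits: confidently wrong
p = softmax(z)

grad_ce = p - y                         # d(cross-entropy)/dz: large, so we learn
jac = np.diag(p) - np.outer(p, p)       # softmax Jacobian dp/dz
grad_mse = jac @ (2 * (p - y) / 3)      # d(mse)/dz: the Jacobian crushes it
print('cross-entropy grad:', grad_ce)
print('mse grad:          ', grad_mse)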

6 relu activation + categorical_crossentropy + Adam

And what if we put Adam on top of that?

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
for i in range(10):
    model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

Test accuracy ends up around 95.7%, a tiny bit better, while training accuracy hits 99.5%: mild overfitting. I couldn't push it any further than this.

II. Dropout: Improving Results on Corrupted Data

What if the data we evaluate on at the end no longer matches the data we trained on? Below, the test images are corrupted with Gaussian noise.

def load_error_data():
    (x_train,y_train),(x_test,y_test)=mnist.load_data()
    number=10000
    x_train=x_train[0:number]
    y_train=y_train[0:number]
    x_train=x_train.reshape(x_train.shape[0],28*28)
    x_test=x_test.reshape(x_test.shape[0],28*28)
    #one-hot encode the labels into 10-dimensional vectors
    y_train=np_utils.to_categorical(y_train,10)
    y_test=np_utils.to_categorical(y_test,10)
    #normalize pixel values to [0,1]
    x_train=x_train.astype('float32')
    x_test=x_test.astype('float32')
    x_train=x_train/255
    x_test=x_test/255
    # corrupt the test images with Gaussian noise (np.random.normal
    # treats each pixel value as the mean of a unit-variance draw),
    # so the test set no longer matches what the model trained on
    x_test=np.random.normal(x_test)
    return (x_train,y_train),(x_test,y_test)
(x_train,y_train),(x_test,y_test)=load_error_data()
model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=666,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
model.summary()
model.compile(loss='mse',optimizer='adam',metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

The result is predictably bad:
Accuracy of Training Set: 0.9895
Accuracy of Test Set: 0.5082

A small change: insert Dropout layers. Dropout randomly zeroes a fraction (here 50%) of a layer's activations at each training step, which discourages co-adapted features and tends to make the model more robust; a sketch follows the results below.

model = Sequential()
model.add(Dense(input_dim=28*28,units=666,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=666,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=666,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=10,activation='softmax'))
# model.add(Dropout(0.5))
model.summary()
model.compile(loss='mse',optimizer='adam',metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

The final results:
Accuracy of Training Set: 0.9873
Accuracy of Test Set: 0.6099
Test accuracy improved by about 10 points, which I'd call a decent gain.
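For reference, a minimal sketch (my own, not Keras internals) of what Dropout(0.5) does at training time; Keras uses "inverted" dropout, scaling the survivors so that inference needs no rescaling:

import numpy as np

np.random.seed(0)

def dropout(a, rate=0.5):
    mask = np.random.rand(*a.shape) >= rate   # keep each unit with prob 1-rate
    return a * mask / (1.0 - rate)            # scale survivors by 1/(1-rate)

a = np.ones((2, 6))
print(dropout(a))   # about half the entries zeroed, the rest scaled to 2.0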

III. Adding Convolutional Layers

After all that, we finally arrive at the star of this post: the convolutional layer.
First, the data preparation.

def load_con_data():
    (x_train,y_train),(x_test,y_test)=mnist.load_data()
    number=10000
    x_train=x_train[0:number]
    y_train=y_train[0:number]
    #a conv net needs the 2-D image shape (28,28,1), not the earlier flat 784-vector
    x_train=x_train.reshape(x_train.shape[0],28,28,1)
    x_test=x_test.reshape(x_test.shape[0],28,28,1)
    #one-hot encode the labels into 10-dimensional vectors
    y_train=np_utils.to_categorical(y_train,10)
    y_test=np_utils.to_categorical(y_test,10)
    #normalize pixel values to [0,1]
    x_train=x_train.astype('float32')
    x_test=x_test.astype('float32')
    x_train=x_train/255
    x_test=x_test/255
    
#     x_test=np.random.normal(x_test)
    return (x_train,y_train),(x_test,y_test)
(x_train,y_train),(x_test,y_test)=load_con_data()

Now we build the network and train it.

model = Sequential() # 1*28*28
model.add(Conv2D(32,(3,3),input_shape=(28,28,1))) #32*26*26
# model.add(Dropout(0.5))
model.add(Conv2D(64,(3,3))) #64*24*24
# model.add(Dropout(0.5))
# model.add(Conv2D(128,(3,3))) #128*22*22
# model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(units=100,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
# model.add(Dropout(0.5))
model.summary()
model.compile(loss='mse',optimizer='adam',metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

Accuracy of Training Set: 0.9854
Accuracy of Test Set: 0.9308
The accuracy falls short of our earlier 95%? Don't worry, let's add pooling layers.

model = Sequential() # 1*28*28
model.add(Conv2D(32,(3,3),input_shape=(28,28,1))) #32*26*26
model.add(MaxPooling2D(2,2)) #32*13*13
# model.add(Dropout(0.5))
model.add(Conv2D(64,(3,3))) #64*11*11
model.add(MaxPooling2D(2,2)) #64*5*5
# model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(units=100,activation='relu'))
model.add(Dense(units=10,activation='softmax'))
# model.add(Dropout(0.5))
model.summary()
model.compile(loss='mse',optimizer='adam',metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
print('Accuracy of Test Set:',score[1])

Accuracy of Training Set: 0.997
Accuracy of Test Set: 0.9777
Already at 97%. Next, let's add some Dropout layers.

model = Sequential()
model.add(Conv2D(32,(3,3),input_shape=(28,28,1))) # 320 params (3*3*1*32+32)  1*28*28 => 32*26*26
model.add(MaxPooling2D(2,2))   # => 32*13*13
model.add(Dropout(0.25))
model.add(Conv2D(64,(3,3))) # 18496 params (3*3*32*64+64)  => 64*11*11
model.add(MaxPooling2D(2,2)) # => 64*5*5
model.add(Dropout(0.25))
model.add(Flatten()) # => 64*5*5 = 1600
model.add(Dense(units=100,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=10,activation='softmax'))
# model.compile(loss='mse',optimizer=SGD(lr=0.1),metrics=['accuracy'])
model.summary()
model.compile(loss='categorical_crossentropy',optimizer=SGD(lr=0.1),metrics=['accuracy'])
model.fit(x_train,y_train,batch_size=100,epochs=20)

score=model.evaluate(x_train,y_train)
# print('Total loss on Testing Set:',score[0])
print('Accuracy of Training Set:',score[1])

score=model.evaluate(x_test,y_test)
# print('Total loss on Testing Set:',score[0])
print('Accuracy of Testing Set:',score[1])

Accuracy of Training Set: 0.9947833333333334
Accuracy of Testing Set: 0.9901
That's as far as I could take it this time; there may well be better methods, but those will have to wait for next time.

So what is a convolutional layer? It slides a small kernel across the whole image and, at each position, takes the elementwise product of the kernel with that window and sums it, yielding one value of the feature map.
As for the pooling layer, it slides a window over the feature map and keeps a summary of each window; max pooling keeps the maximum value.
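Here is a minimal single-channel numpy sketch (my own, "valid" padding, stride 1) of what Conv2D and MaxPooling2D compute:

import numpy as np

def conv2d(img, kernel):
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # elementwise product of kernel and window, summed
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(img, size=2):
    oh, ow = img.shape[0] // size, img.shape[1] // size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = img[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.array([[1.0, 0.0], [0.0, -1.0]])
print(conv2d(img, k))   # a 3x3 feature map (28x28 -> 26x26 in the models above)
print(max_pool(img))    # a 2x2 downsampled map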

A more detailed introduction will come when I have time. That's all for this post.
