Kaggle for Beginners — Digit Recognizer 0.997+

Working on handwritten digit recognition, I learned to build my own convolutional neural network step by step, and finally used multi-model voting to reach a respectable score. This post involves a lot of training runs, and my GPU is nothing special (a 1050 Ti), so they took a long time. If you have no GPU, I suggest skipping the training: just read along, then use the final model together with the weight files I provide.

 

1. Imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing
from keras import layers
from keras.layers import Input, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D, Add
from keras.layers import AveragePooling2D, MaxPooling2D, Dropout,GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.models import Model, Sequential
from keras.callbacks import LearningRateScheduler, ReduceLROnPlateau
from keras.initializers import glorot_uniform
from keras.optimizers import RMSprop
import keras.backend as K
K.set_image_data_format("channels_last")
import time
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from keras.preprocessing.image import ImageDataGenerator

2. Loading the Data

train = pd.read_csv("all/train.csv")
test = pd.read_csv("all/test.csv")
sample_submission = pd.read_csv("all/sample_submission.csv")

3. Initial Processing and a Look at the Data

label = np.array(train.label)
data = np.array(train.drop(["label"], axis=1))

# Normalize and one-hot encode
X = data.reshape((42000, 28, 28, 1))
X = X / 255
enc = OneHotEncoder()
Y = enc.fit_transform(label.reshape((42000, 1))).toarray()
train_x, val_x, train_y, val_y = train_test_split(X, Y, test_size=0.1, random_state=0)
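As a standalone sanity check of what the encoder produces, here is a minimal sketch on three labels; the `categories` argument pins the columns to the ten digits so a small sample still yields 10 columns (on the full training set this is unnecessary, since all ten digits appear):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Pin the categories to the ten digits so even a 3-row sample yields 10 columns.
labels = np.array([0, 3, 9]).reshape(-1, 1)
enc = OneHotEncoder(categories=[list(range(10))])
onehot = enc.fit_transform(labels).toarray()
print(onehot.shape)  # one row per label, one column per digit class
```

Each row contains a single 1 in the column of its digit, which is exactly the target format `categorical_crossentropy` expects.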

# Normalize the test data
test_data = np.array(test)
test_data = test_data.reshape((28000, 28, 28, 1))
test_data = test_data / 255

# Inspect an arbitrary sample
index = 25351
plt.imshow(data.reshape((42000, 28, 28))[index])
label[index]

Output: (image of the sample digit and its label, omitted here)

4. Data Augmentation

We rotate and shift the images horizontally and vertically to enlarge the dataset. Data augmentation is very effective. The rotation angle and other parameters can be tuned; this particular set is the one most people use, but you may well find parameters that work better for your own model.

datagen = ImageDataGenerator(featurewise_center=False, samplewise_center=False, featurewise_std_normalization=False, samplewise_std_normalization=False, zca_whitening=False, rotation_range=10, zoom_range=0.1, width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=False, vertical_flip=False)
datagen.fit(train_x)

5. Building the Model Step by Step

First, choose the number of filters for the first layer.

model = [0] * 8
for j in range(0, 8):
    model[j] = Sequential()
    model[j].add(Conv2D(2 ** (j + 1), kernel_size=2, activation='relu', input_shape=(28,28,1)))
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print(str(2 ** (j + 1)) + " filters:")
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

Look at the validation scores in the training output and pick the better-performing values. The training logs are too long to reproduce here. My results pointed to 32 or 64; I continued with 32.
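Rather than eyeballing the logs, the `History` objects returned by `fit_generator` can be compared programmatically. A sketch with dummy numbers standing in for the real curves (normally `histories = [model[j].fit_generator(...).history for j in range(8)]`; the metric key is `val_acc` on older Keras, `val_accuracy` on newer versions):

```python
# Dummy val_acc histories for the 8 runs; replace with the real History.history dicts.
histories = [
    {"val_acc": [0.970, 0.980]},  # 2 filters
    {"val_acc": [0.972, 0.984]},  # 4 filters
    {"val_acc": [0.980, 0.988]},  # 8 filters
    {"val_acc": [0.982, 0.990]},  # 16 filters
    {"val_acc": [0.990, 0.993]},  # 32 filters
    {"val_acc": [0.990, 0.993]},  # 64 filters
    {"val_acc": [0.989, 0.992]},  # 128 filters
    {"val_acc": [0.988, 0.991]},  # 256 filters
]
# Pick the run whose best validation accuracy is highest (ties go to the smaller model).
best = max(range(len(histories)), key=lambda j: max(histories[j]["val_acc"]))
print("best filter count:", 2 ** (best + 1))
```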

Choosing the first layer's kernel_size

model = [0] * 6
for j in range(6):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=2 + j, activation='relu', input_shape=(28,28,1)))
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("kernel size: " + str(2 + j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

Sizes 3, 4, 5, 6, and 7 all seemed to work well; I continued with 5.

 

Choosing the padding parameter

model = Sequential()
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
model.add(Flatten())
model.add(Dense(256, activation="relu"))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
print("---------------------------------------------------------------------------------------------")
print("same:")
start = time.time()
model.fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
end = time.time()
print(str(end - start) + " seconds")
print("---------------------------------------------------------------------------------------------")

model = Sequential()
model.add(Conv2D(32, kernel_size=5, padding="valid", activation='relu', input_shape=(28,28,1)))
model.add(Flatten())
model.add(Dense(256, activation="relu"))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
print("---------------------------------------------------------------------------------------------")
print("valid:")
start = time.time()
model.fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
end = time.time()
print(str(end - start) + " seconds")
print("---------------------------------------------------------------------------------------------")

"same" seems slightly better, but the difference is small.

 

Choosing the number of convolutional layers

model = [0] * 5
for j in range(5):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    
    if j > 0:
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    if j > 1:
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    if j > 2:
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    if j > 3:
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("number of conv layers: " + str(1 + j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

3, 4, or 5 layers are all viable.

 

Choosing the pooling layer's pool_size

model = [0] * 5
for j in range(5):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=j+2))
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("pool size: " + str(2 + j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

2, 3, and 4 all look fine.

I decided to continue with 5 convolutional layers and pool_size=3.

Next, choose where to add pooling layers: starting from the last convolutional layer and working upward, add a pooling layer after each convolutional layer in turn. From the training runs I observed that adding a pooling layer after only the last convolutional layer, or after each of the last two, works best. I chose pooling after the last two convolutional layers.

model = Sequential()
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
model.add(MaxPooling2D(pool_size=3, padding="same"))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
model.add(MaxPooling2D(pool_size=3))

model.add(Flatten())
model.add(Dense(256, activation="relu"))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
print("---------------------------------------------------------------------------------------------")
print("pooling after the last two conv layers:")
start = time.time()
model.fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
end = time.time()
print(str(end - start) + " seconds")
print("---------------------------------------------------------------------------------------------")

 

Treating the model built so far (five convolutional layers with pooling after the last two) as one block, choose how many such blocks to stack.

model = [0] * 2
for j in range(2):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    if j > 0:
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
        model[j].add(MaxPooling2D(pool_size=3, padding="same"))
        model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
        model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("number of blocks: " + str(1 + j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=35, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

Two blocks seem to work well.

 

Choosing the new block's number of filters

model = [0] * 2
for j in range(2):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Conv2D(32 * 2 ** (j + 1), kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32 * 2 ** (j + 1), kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32 * 2 ** (j + 1), kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32 * 2 ** (j + 1), kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32 * 2 ** (j + 1), kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("new block filters: " + str(32 * 2 ** (j + 1)))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=35, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

All things considered, I chose 64.

 

Choosing the new block's kernel_size

model = [0] * 6
for j in range(6):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Conv2D(64, kernel_size=2 + j, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=2 + j, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=2 + j, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=2 + j, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(64, kernel_size=2 + j, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("new block kernel size: " + str(2 + j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

3, 4, 5, and 6 all felt about the same; I kept using 5.

 

Choosing the new block's pool_size

model = [0] * 5
for j in range(5):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=2 + j, padding="same"))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=2 + j, padding="same"))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("new block pool size: " + str(2 + j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

Still 3.

 

The current model has two conv-pool blocks plus the final fully connected block, three blocks in total; choose where to add dropout. There are 7 schemes: after each block individually, after all three, or after any two.

model = [0] * 7
for j in range(7):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    if j == 0 or j == 3 or j == 4 or j == 6:
        model[j].add(Dropout(0.1))
    
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    if j == 1 or j == 3 or j == 5 or j == 6:
        model[j].add(Dropout(0.1))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    
    if j == 2 or j == 4 or j == 5 or j == 6:
        model[j].add(Dropout(0.1))
    
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("dropout placement scheme: " + str(j))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=20, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

I chose scheme 5, i.e. dropout after the second and third blocks.

 

Choose the rate for the first dropout layer. Looping over dropout rates during training may run into some issues; you can instead edit the value by hand and observe each run.

model = [0] * 7
for j in range(7):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Dropout(0.1 * (j + 1)))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    
    model[j].add(Dropout(0.1))
    
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("first dropout rate: " + str(0.1 * (j + 1)))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=35, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

0.1, 0.4, 0.5, 0.6, and 0.7 all seemed fine; I chose 0.4.

 

Choosing the second dropout rate

model = [0] * 6
for j in range(6):
    model[j] = Sequential()
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(32, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    model[j].add(Conv2D(64, kernel_size=5, padding="same", activation='relu', input_shape=(28,28,1)))
    model[j].add(MaxPooling2D(pool_size=3, padding="same"))
    
    model[j].add(Dropout(0.4))
    
    model[j].add(Flatten())
    model[j].add(Dense(256, activation="relu"))
    
    model[j].add(Dropout(0.2 + j * 0.1))
    
    model[j].add(Dense(10, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    print("---------------------------------------------------------------------------------------------")
    print("second dropout rate: " + str(0.2 + j * 0.1))
    start = time.time()
    model[j].fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=35, steps_per_epoch= len(train_x) // 64)
    end = time.time()
    print(str(end - start) + " seconds")
    print("---------------------------------------------------------------------------------------------")

0.2 and 0.5 performed best; I chose 0.5.

 

Add batch normalization. As before, there are many ways to place the BN layers: after every convolutional layer, or after only some of them. I tried several schemes and they all worked well; in the end I added BN after every convolutional layer. This is also the model I finally settled on. One could go on to tune the fully connected layers and so on, but training showed the current model already performs well, so I'll leave further experiments to the reader.

model = Sequential()
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0), input_shape=(28,28,1)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
    
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
    
model.add(Dropout(0.4))
    
model.add(Flatten())
model.add(Dense(256, activation="relu", kernel_initializer=glorot_uniform(seed=0)))
    
model.add(Dropout(0.5))
    
model.add(Dense(10, activation='softmax', kernel_initializer=glorot_uniform(seed=0)))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
print("---------------------------------------------------------------------------------------------")
print("with BN:")
start = time.time()
model.fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=100, steps_per_epoch= len(train_x) // 64)
end = time.time()
print(str(end - start) + " seconds")
print("---------------------------------------------------------------------------------------------")

 

6. Training the Final Model

The final model

model = Sequential()
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0), input_shape=(28,28,1)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
model.add(Conv2D(32, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
    
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
model.add(Conv2D(64, kernel_size=5, padding="same", activation='relu', kernel_initializer=glorot_uniform(seed=0)))
model.add(BatchNormalization(axis=3))
model.add(MaxPooling2D(pool_size=3, padding="same"))
    
model.add(Dropout(0.4))
    
model.add(Flatten())
model.add(Dense(256, activation="relu", kernel_initializer=glorot_uniform(seed=0)))
    
model.add(Dropout(0.5))
    
model.add(Dense(10, activation='softmax', kernel_initializer=glorot_uniform(seed=0)))

Compiling the model

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

Training

print("---------------------------------------------------------------------------------------------")
start = time.time()
model.fit_generator(datagen.flow(train_x, train_y, batch_size=64), validation_data=(val_x, val_y), epochs=80, steps_per_epoch= len(train_x) // 64)
end = time.time()
print(str(end - start) + " seconds")
print("---------------------------------------------------------------------------------------------")

Predicting and saving the submission file

predict = np.argmax(model.predict(test_data), axis=1)

sample_submission["Label"] = predict
sample_submission.to_csv("all/1.csv", index=False)

After about 80 epochs, models scoring 0.996+ start to appear, and these can push the final score to 0.997+. By training one epoch at a time, I collected several good models and saved their weights.
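The epoch-by-epoch routine can also be automated with Keras's `ModelCheckpoint` callback, which saves the weights whenever the monitored metric improves. A sketch, reusing `model`, `datagen`, `train_x`, etc. from above (the filename is arbitrary; the metric key is "val_acc" on older Keras and "val_accuracy" on newer versions):

```python
from keras.callbacks import ModelCheckpoint

# Save weights only when validation accuracy reaches a new best.
checkpoint = ModelCheckpoint("best_weights.h5",
                             monitor="val_acc",  # "val_accuracy" on newer Keras
                             save_best_only=True,
                             save_weights_only=True,
                             verbose=1)
model.fit_generator(datagen.flow(train_x, train_y, batch_size=64),
                    validation_data=(val_x, val_y),
                    epochs=80,
                    steps_per_epoch=len(train_x) // 64,
                    callbacks=[checkpoint])
```

This leaves the single best-scoring set of weights on disk instead of requiring a manual save after each epoch.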

Saving and loading model weights

model.save_weights("my_model_weights_614_1.h5")

model.load_weights("my_model_weights_614_1.h5")

 

Model ensembling uses simple majority voting: with three models each scoring 0.996+, the vote almost always reaches 0.997+. Here is an example combination that scored 0.99728.

The weights can be downloaded from Baidu Netdisk (extraction code: ltu5).

Loading the weights and predicting

model.load_weights("my_model_weights_600_2.h5")
predict_600_2 = np.argmax(model.predict(test_data), axis=1)

model.load_weights("my_model_weights_628_2.h5")
predict_628_2 = np.argmax(model.predict(test_data), axis=1)

model.load_weights("my_model_weights_642_2.h5")
predict_642_2 = np.argmax(model.predict(test_data), axis=1)

The voting function. I couldn't find a ready-made function that votes directly on predictions (maybe one exists), so I wrote my own.

def combine_model(predict1, predict2, predict3):
    not_equal_index = np.unique(np.hstack((np.where(predict1 != predict2)[0], np.where(predict1 != predict3)[0], np.where(predict2 != predict3)[0])))
    predict = np.copy(predict1)
    for i in not_equal_index:
        if (predict2[i] == predict3[i]) and (predict2[i] != predict1[i]):
            predict[i] = predict2[i]
    return predict
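The same three-model vote can also be written generically with `np.bincount`. This is a sketch, not the code used for the score above; like `combine_model`, a three-way disagreement falls back to the first model's prediction:

```python
import numpy as np

def vote(*predictions):
    """Majority vote across several 1-D integer label arrays of equal length."""
    stacked = np.stack(predictions)  # shape: (n_models, n_samples)
    result = np.empty(stacked.shape[1], dtype=stacked.dtype)
    for i in range(stacked.shape[1]):
        # Count how often each digit was predicted for sample i.
        counts = np.bincount(stacked[:, i], minlength=10)
        best = counts.argmax()
        if counts[best] == 1:
            # All models disagree: keep the first model's prediction.
            best = stacked[0, i]
        result[i] = best
    return result
```

With more than three models the fallback rule becomes "keep model 1 on a full tie", which generalizes the behavior of `combine_model`.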

Combined prediction

sample_submission["Label"] = combine_model(predict_642_2, predict_600_2, predict_628_2)
sample_submission.to_csv("all/1.csv", index=False)

Submitting this scores 0.99728, which should be within the top 10%. Different model combinations may score even higher; my best so far from various combinations is 0.99785. Feel free to keep exploring.
