KNN, SVC, and CNN: three machine-learning methods for MNIST classification

Dataset

MNIST contains 70,000 images of handwritten digits; each is a 28×28-pixel image of a digit from 0 to 9.

The 784 pixels of each image are flattened into a one-dimensional array of length 784 and used as the model's input features. Each image also comes with a label, the digit 0-9 it depicts; for the CNN the labels are additionally one-hot encoded into length-10 vectors.
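
As a small illustration of these two representations (a hypothetical 28×28 array img and an integer label, not data loaded from MNIST), flattening and one-hot encoding look like this:

import numpy as np

img = np.random.randint(0, 256, size=(28, 28))   # stand-in for one 28x28 digit image
label = 7                                        # the digit this image is supposed to show

features = img.reshape(-1)     # flatten to a length-784 feature vector
one_hot = np.eye(10)[label]    # length-10 one-hot label: [0,0,0,0,0,0,0,1,0,0]
print(features.shape, one_hot)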

# Import libraries
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.pyplot import MultipleLocator
import numpy as np
import time as t
import warnings
warnings.filterwarnings('ignore')

Load the data

Place the downloaded mnist-original.mat file in the ./datasets/mldata/ directory (matching data_home below) and load the data:

from sklearn import datasets
from sklearn.datasets import fetch_mldata
# Load the data
mnist = fetch_mldata('mnist-original', data_home = './datasets/') 
X, y = mnist['data'], mnist['target'] # X:data,y:label
# print(X.shape, y)# 70000 70000
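
Note that fetch_mldata has been removed from recent scikit-learn releases (the mldata.org service it relied on is offline). On a newer installation, a roughly equivalent load via OpenML is sketched below; the exact keyword arguments differ slightly between scikit-learn versions.

from sklearn.datasets import fetch_openml
# OpenML copy of MNIST: data is (70000, 784), target holds the digit labels as strings
# (as_frame=False asks for plain numpy arrays; drop it on versions that lack the argument)
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
X, y = mnist['data'], mnist['target'].astype(np.uint8)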

Visualize the dataset

def plot_digits(instances, images_per_row=20, **options):
    size = 28
    images_per_row = min(len(instances), images_per_row)
    images = [instance.reshape(size,size) for instance in instances]
    n_rows = (len(instances) - 1) // images_per_row + 1
    row_images = []
    n_empty = n_rows * images_per_row - len(instances)
    images.append(np.zeros((size, size * n_empty)))
    for row in range(n_rows):
        rimages = images[row * images_per_row : (row + 1) * images_per_row]
        row_images.append(np.concatenate(rimages, axis=1))
    image = np.concatenate(row_images, axis=0)
    plt.imshow(image, cmap = matplotlib.cm.binary, **options)
    plt.axis("off")

plt.figure(figsize=(16,8))
example_images = np.r_[X[:12000:600], X[13000:30600:600], X[30600:60000:590]]
plot_digits(example_images, images_per_row=20)
# save_fig("more_digits_plot")
plt.show()

[Figure: grid of sample digit images from MNIST]

Data preprocessing

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
# Standardize the features
scaler = StandardScaler()

shuffle_index = np.random.permutation(60000)  # shuffled ordering of the first 60000 indices (the original training portion)
X1, y1 = X[shuffle_index[:2000]], y[shuffle_index[:2000]]
# For efficiency, only 2000 randomly chosen samples are used for training and evaluation

X_standardized = scaler.fit_transform(X1)

# Randomly split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X_standardized, y1, test_size=0.2)
print(X_train.shape, X_test.shape)
(1600, 784) (400, 784)
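
One caveat about the cell above: the StandardScaler is fitted on all 2000 samples before the split, so statistics from the eventual test set leak into the scaling. A leakage-free variant (a sketch of the usual pattern, not what was run here) splits first and fits the scaler on the training portion only:

X_train, X_test, y_train, y_test = train_test_split(X1, y1, test_size=0.2)
scaler = StandardScaler().fit(X_train)   # mean/std computed from training data only
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)        # test set scaled with the training statistics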

Method 1 KNN

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import learning_curve
from sklearn import metrics
begin_t = t.time()
# Create a KNN classifier with 5 neighbours
knn = KNeighborsClassifier(n_neighbors=5, n_jobs=-1)

# Train the model
model = knn.fit(X_train, y_train)

# Predict on the test set
predictions = model.predict(X_test)

# Compute test accuracy
accuracy = metrics.accuracy_score(y_test, predictions)
print ('accuracy:%.2f%%'%(100*accuracy))
print("Total time: {:.2f}s".format(t.time()-begin_t))
accuracy:84.50%
Total time: 0.54s
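
Accuracy alone does not show which digits get confused with which. As an optional follow-up (reusing the predictions computed above), scikit-learn can print a confusion matrix and a per-class report:

from sklearn.metrics import confusion_matrix, classification_report
# rows are true digits, columns are predicted digits; off-diagonal counts are errors
print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions, digits=3))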

Hyperparameter tuning

Use cross_val_score to observe how different choices of the parameter k affect accuracy, and plot the resulting curve.

from sklearn.model_selection import cross_val_score

knn = KNeighborsClassifier  # alias the class; it is instantiated with a specific k inside the loop below
begin_t = t.time()
k_scores = []
krange = range(1,21)
for k in krange:
    scores = cross_val_score(knn(n_neighbors = k),X_train,y_train,cv=10,scoring = 'accuracy')   # 10-fold cross-validation
    k_scores.append(scores.mean())
print("Total time: {:.2f}s".format(t.time()-begin_t))

plt.plot(krange, k_scores)
x_major_locator = MultipleLocator(1)
# set the x-axis tick interval to 1
ax = plt.gca()
# ax is the current axes instance
ax.xaxis.set_major_locator(x_major_locator)
# place the major x-axis ticks at multiples of 1
plt.xlabel("Value of K for KNN")
plt.ylabel("Cross Validated Accuracy")
plt.show()
Total time: 82.00s

[Figure: cross-validated accuracy vs. value of k for KNN]

From the curve above, the best values of k lie roughly in the interval [3, 7]. The drop in accuracy as k grows larger is likely because the model over-smooths (under-fits) with only 1600 training samples. GridSearchCV is then used to pick the best n_neighbors value for KNN.

from sklearn.model_selection import GridSearchCV

begin_t = t.time()
# List the parameter to tune and its candidate values
param_grid = {"n_neighbors":[3,4,5,6,7]}
print("Parameters:{}".format(param_grid))

knn = KNeighborsClassifier
grid_search = GridSearchCV(knn(),param_grid,cv=10) # instantiate a GridSearchCV object
grid_search.fit(X_train,y_train) # search for the best parameter and refit an estimator with it

print("Test set score:{:.2f}".format(grid_search.score(X_test,y_test)))
print("Best parameters:{}".format(grid_search.best_params_))
print("Best score on train set:{:.2f}".format(grid_search.best_score_))
print("Total time: {:.2f}s".format(t.time()-begin_t))
Parameters:{'n_neighbors': [3, 4, 5, 6, 7]}
Test set score:0.84
Best parameters:{'n_neighbors': 5}
Best score on train set:0.84
Total time: 179.78s

Plot the learning curve

learning_curve returns the scores obtained while fitting the model on increasing amounts of training data; from these we plot the learning curve and compare the loss on the training data and on the cross-validation data at different training-set sizes.

from sklearn.model_selection import learning_curve 

estimator = knn(n_neighbors=grid_search.best_estimator_.n_neighbors)
t1 = t.time()
train_sizes, train_loss, test_loss = learning_curve( \
                estimator, X_train, y_train, cv=10, scoring="neg_log_loss", \
                train_sizes=[0.1, 0.25, 0.5, 0.75, 1], n_jobs=-1) 
print("fit time: {:.2f}s".format(t.time()-t1))

train_loss_mean = -np.mean(train_loss, axis=1)
test_loss_mean = -np.mean(test_loss, axis=1)

# print(train_sizes, train_loss, test_loss)
plt.plot(train_sizes, train_loss_mean, 'o-', color="r", label="Training")
plt.plot(train_sizes, test_loss_mean, 'o-', color="y", label="Cross-validation")

title = 'Learning Curves (kNN, $n_{neighbors}=%d$)'%grid_search.best_estimator_.n_neighbors 
plt.title(title) 
plt.xlabel("Training examples")
plt.ylabel("Loss")
plt.legend(loc="best")
plt.show()
fit time: 54.55s

[Figure: learning curves for kNN (training vs. cross-validation loss)]

Plot the validation curve

Use validation_curve to plot the loss curves on the training data and the cross-validation data for different values of $n_{neighbors}$.

from sklearn.model_selection import validation_curve 

param_range = np.array(range(1,11,1))
# estimator = knn(n_neighbors=grid_search.best_estimator_.n_neighbors)
t1 = t.time()
train_loss, test_loss = validation_curve( \
                            estimator, X_train, y_train, param_name = 'n_neighbors', param_range = param_range, \
                            cv=10, scoring='neg_log_loss', n_jobs=-1)
print("fit time: {:.2f}s".format(t.time()-t1))

train_loss_mean = -np.mean(train_loss, axis=1)
test_loss_mean = -np.mean(test_loss, axis=1)

plt.plot(param_range, train_loss_mean, 'o-', color="r", label="Training")
plt.plot(param_range, test_loss_mean, 'o-', color="y", label="Cross-validation")

title = 'Validation Curves'
plt.title(title) 
plt.xlabel("$n_{neighbors}$")
plt.ylabel("Loss")
plt.legend(loc="best")
plt.show()
fit time: 188.45s

[Figure: validation curves over $n_{neighbors}$ (training vs. cross-validation loss)]

The curves show how the loss on the training data and on the cross-validation data changes with $n_{neighbors}$. Because kNN labels a point by a vote among its k nearest neighbours, the training loss rises slightly as $n_{neighbors}$ grows. Choosing k from the cross-validation loss, however, lets the kNN model generalise better to the test set; combined with the tuning described above, this choice of k gives the highest accuracy.
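
To make the voting rule concrete, here is a minimal plain-numpy sketch of how kNN classifies a single query point x against the stored training set (illustrative only; the experiments above use scikit-learn's implementation):

import numpy as np

def knn_predict(X_train, y_train, x, k=5):
    dists = np.linalg.norm(X_train - x, axis=1)        # distance to every training sample
    nearest = np.argsort(dists)[:k]                    # indices of the k closest samples
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                   # majority vote among their labels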

Method 2 SVC

# Import the SVC class
from sklearn.svm import SVC
begin_t = t.time()
# Create the SVC classifier (probability=True is needed later for log-loss scoring)
svc = SVC(probability=True)
# Train the model
model = svc.fit(X_train, y_train)

# Predict on the test set
predictions = model.predict(X_test)
# Compute test accuracy
accuracy = metrics.accuracy_score(y_test, predictions)

print ('accuracy:%.2f%%'%(100*accuracy))
print("Total time: {:.2f}s".format(t.time()-begin_t))
accuracy:88.50%
Total time: 11.92s

Hyperparameter tuning

The SVC uses the Gaussian RBF kernel, and its gamma parameter is tuned. validation_curve is used to observe how the loss on the training data and on the cross-validation data changes with gamma, and a suitable value of gamma is then chosen.
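
For reference, gamma is the width parameter of the RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2): a larger gamma means a narrower kernel, a more flexible decision boundary, and a higher risk of overfitting. A tiny numpy sketch of the kernel value between two samples (illustrative, not part of the pipeline above):

import numpy as np

def rbf_kernel_value(x1, x2, gamma):
    # exp(-gamma * squared Euclidean distance between the two samples)
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

# e.g. rbf_kernel_value(X_train[0], X_train[1], gamma=0.001)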

from sklearn.model_selection import validation_curve 

param_range = np.logspace(-5,-3,6)

t1 = t.time()
train_loss, test_loss = validation_curve(
    svc, X_train, y_train, param_name = 'gamma', param_range = param_range, cv=10, scoring='neg_log_loss')
print("fit time: {:.2f}s".format(t.time()-t1))

train_loss_mean = -np.mean(train_loss, axis=1)
test_loss_mean = -np.mean(test_loss, axis=1)

plt.plot(param_range, train_loss_mean, 'o-', color="r", label="Training")
plt.plot(param_range, test_loss_mean, 'o-', color="y", label="Cross-validation")

plt.xlabel("Gamma")
plt.ylabel("Loss")
plt.legend(loc="best")
plt.show()
fit time: 994.48s

[Figure: validation curves over gamma for the SVC (training vs. cross-validation loss)]

Use GridSearchCV to find the best value of gamma.

from sklearn.model_selection import GridSearchCV

begin_t = t.time()
# List the parameter to tune and its candidate values
param_grid = {"gamma":param_range}

print("Parameters:{}".format(param_grid))

grid_search = GridSearchCV(svc,param_grid,cv=10) # instantiate a GridSearchCV object
grid_search.fit(X_train,y_train) # search for the best parameter and refit an estimator with it

print("Test set score:{:.2f}".format(grid_search.score(X_test,y_test)))
print("Best parameters:{}".format(grid_search.best_params_))
print("Best score on train set:{:.2f}".format(grid_search.best_score_))
print("Total time: {:.2f}s".format(t.time()-begin_t))
Parameters:{'gamma': array([1.00000000e-05, 2.51188643e-05, 6.30957344e-05, 1.58489319e-04,
       3.98107171e-04, 1.00000000e-03])}
Test set score:0.89
Best parameters:{'gamma': 0.001}
Best score on train set:0.88
Total time: 1022.71s

Plot the learning curve

learning_curve returns the scores obtained while fitting the model on increasing amounts of training data; from these we plot the learning curve and compare the loss on the training data and on the cross-validation data at different training-set sizes.

t1 = t.time()
best_gamma = grid_search.best_estimator_.gamma
svc = SVC(gamma=best_gamma, probability=True)
train_sizes, train_loss, test_loss = learning_curve(
    svc, X_train, y_train, cv=10, scoring='neg_log_loss',
    train_sizes=[0.1, 0.25, 0.5, 0.75, 1])
print("fit time: {:.2f}s".format(t.time()-t1))

train_loss_mean = -np.mean(train_loss, axis=1)
test_loss_mean = -np.mean(test_loss, axis=1)

plt.plot(train_sizes, train_loss_mean, 'o-', color="r", label="Training")
plt.plot(train_sizes, test_loss_mean, 'o-', color="y", label="Cross-validation")

title = 'Learning Curves (SVC, gamma=%s)'%str(best_gamma)
plt.title(title) 
plt.xlabel("Training examples")
plt.ylabel("Loss")
plt.legend(loc="best")
plt.show()
fit time: 228.53s

[Figure: learning curves for the SVC (training vs. cross-validation loss)]

Method 3 CNN

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten
from keras.layers.convolutional import Conv2D,MaxPooling2D
from keras.utils import np_utils
from keras.callbacks import EarlyStopping
from keras import optimizers
from keras.callbacks import ReduceLROnPlateau
from keras import backend as K
t_begin = t.time()

K.set_image_data_format("channels_last")

# Set the random seed
np.random.seed(0)

# Image dimensions
channels = 1
height = 28
width = 28

# Load the data and targets from the Keras MNIST dataset
(data_train,target_train),(data_test,target_test) = mnist.load_data()
#print(data_train)

# Reshape the training images to (n_samples, height, width, channels); shape[0] is the number of samples
data_train = data_train.reshape(data_train.shape[0],height,width,channels)

# Reshape the test images the same way
data_test = data_test.reshape(data_test.shape[0],height,width,channels)

# Rescale pixel intensities to the range [0, 1]
features_train = data_train / 255
features_test = data_test / 255

# One-hot encode the targets
target_train = np_utils.to_categorical(target_train)
target_test = np_utils.to_categorical(target_test)
number_of_classes = target_test.shape[1]

# Build the network
network = Sequential()

# Convolutional layer with 64 filters, 5x5 kernels, and ReLU activation
network.add(Conv2D(filters=64,
                    kernel_size=(5,5),
                    input_shape=(width,height,channels),
                    activation = 'relu'))

# Max-pooling layer with a 2x2 window
network.add(MaxPooling2D(pool_size=(2,2)))
# Dropout for regularization
network.add(Dropout(0.5))
# Flatten the feature maps into a vector
network.add(Flatten())
# Fully connected layer with 128 ReLU units
network.add(Dense(128,activation="relu"))
# Another dropout layer
network.add(Dropout(0.5))
# Output layer with softmax over the number of classes
network.add(Dense(number_of_classes,activation='softmax'))

optimizer = optimizers.RMSprop(lr=0.001)
# Compile the network
network.compile(loss = "categorical_crossentropy",
                    optimizer = optimizer,
                    metrics=["accuracy"])

Train the CNN defined above with a fixed learning rate and plot the corresponding loss curves.

# EarlyStop = EarlyStopping(monitor='val_accuracy', patience=3, verbose=1, mode='auto')

# Train the network
history = network.fit(features_train,   # features
                      target_train,     # target vectors
                      epochs = 20,      # number of epochs
                      batch_size = 1000,# samples per batch
                      validation_split = 0.3,
                      # validation_data=(features_test,target_test)
                      # callbacks=[EarlyStop],
                      verbose=1
                      )

loss, accuracy = network.evaluate(features_test, target_test)
print('loss:%.4f accuracy:%.4f' %(loss, accuracy))

epochs = len(history.history['loss'])
plt.plot(range(epochs), history.history['loss'], label='loss')
plt.plot(range(epochs), history.history['val_loss'], label='val_loss')
plt.legend(['Train', 'Validation'], loc='upper right')
plt.title('Loss Curve', fontsize=15)
plt.tick_params(axis='both', labelsize=14)
plt.xlabel('Epoch', fontsize=12)
plt.ylabel('Loss', fontsize=12)
plt.show()

print('Total time: {:.2f}s'.format(t.time()-t_begin))
Epoch 1/20
42/42 [==============================] - 37s 875ms/step - loss: 0.7371 - accuracy: 0.7653 - val_loss: 0.2225 - val_accuracy: 0.9377
Epoch 2/20
42/42 [==============================] - 37s 887ms/step - loss: 0.2614 - accuracy: 0.9229 - val_loss: 0.1435 - val_accuracy: 0.9581
Epoch 3/20
42/42 [==============================] - 36s 856ms/step - loss: 0.1719 - accuracy: 0.9491 - val_loss: 0.1087 - val_accuracy: 0.9669
Epoch 4/20
42/42 [==============================] - 36s 854ms/step - loss: 0.1286 - accuracy: 0.9626 - val_loss: 0.0818 - val_accuracy: 0.9756
Epoch 5/20
42/42 [==============================] - 36s 861ms/step - loss: 0.1031 - accuracy: 0.9695 - val_loss: 0.0863 - val_accuracy: 0.9744
Epoch 6/20
42/42 [==============================] - 36s 862ms/step - loss: 0.0891 - accuracy: 0.9733 - val_loss: 0.0615 - val_accuracy: 0.9819
Epoch 7/20
42/42 [==============================] - 39s 925ms/step - loss: 0.0781 - accuracy: 0.9769 - val_loss: 0.0563 - val_accuracy: 0.9830
Epoch 8/20
42/42 [==============================] - 37s 889ms/step - loss: 0.0694 - accuracy: 0.9786 - val_loss: 0.0528 - val_accuracy: 0.9841
Epoch 9/20
42/42 [==============================] - 38s 912ms/step - loss: 0.0634 - accuracy: 0.9807 - val_loss: 0.0545 - val_accuracy: 0.9839
Epoch 10/20
42/42 [==============================] - 38s 896ms/step - loss: 0.0586 - accuracy: 0.9824 - val_loss: 0.0482 - val_accuracy: 0.9854
Epoch 11/20
42/42 [==============================] - 37s 879ms/step - loss: 0.0534 - accuracy: 0.9841 - val_loss: 0.0456 - val_accuracy: 0.9863
Epoch 12/20
42/42 [==============================] - 37s 886ms/step - loss: 0.0491 - accuracy: 0.9847 - val_loss: 0.0459 - val_accuracy: 0.9855
Epoch 13/20
42/42 [==============================] - 37s 870ms/step - loss: 0.0472 - accuracy: 0.9856 - val_loss: 0.0428 - val_accuracy: 0.9877
Epoch 14/20
42/42 [==============================] - 37s 890ms/step - loss: 0.0433 - accuracy: 0.9868 - val_loss: 0.0443 - val_accuracy: 0.9870
Epoch 15/20
42/42 [==============================] - 37s 893ms/step - loss: 0.0423 - accuracy: 0.9869 - val_loss: 0.0483 - val_accuracy: 0.9858
Epoch 16/20
42/42 [==============================] - 37s 875ms/step - loss: 0.0385 - accuracy: 0.9882 - val_loss: 0.0425 - val_accuracy: 0.9877
Epoch 17/20
42/42 [==============================] - 39s 935ms/step - loss: 0.0372 - accuracy: 0.9881 - val_loss: 0.0442 - val_accuracy: 0.9875
Epoch 18/20
42/42 [==============================] - 39s 927ms/step - loss: 0.0364 - accuracy: 0.9883 - val_loss: 0.0428 - val_accuracy: 0.9874
Epoch 19/20
42/42 [==============================] - 39s 921ms/step - loss: 0.0348 - accuracy: 0.9891 - val_loss: 0.0426 - val_accuracy: 0.9876
Epoch 20/20
42/42 [==============================] - 40s 943ms/step - loss: 0.0345 - accuracy: 0.9890 - val_loss: 0.0445 - val_accuracy: 0.9877
313/313 [==============================] - 3s 11ms/step - loss: 0.0332 - accuracy: 0.9892
loss:0.0332 accuracy:0.9892

[Figure: CNN training and validation loss per epoch]

Total time: 776.91s

Adjusting the learning rate

If, after a number of epochs with a fixed learning rate, the model stops improving, that learning rate may no longer suit the model, and shrinking it during training can push performance further. Keras provides the ReduceLROnPlateau callback for this, which can be combined with EarlyStopping. Starting with a very small learning rate would instead require many iterations to reach a good optimum and make training slow; progressively shrinking the learning rate during training gives both fast and precise convergence to a good model.
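
A sketch of how the two callbacks could be combined (the run below uses only ReduceLROnPlateau; the EarlyStopping settings here are illustrative):

from keras.callbacks import EarlyStopping, ReduceLROnPlateau

early_stop = EarlyStopping(monitor='val_accuracy', patience=3, verbose=1, mode='auto')
reduce_lr = ReduceLROnPlateau(monitor='val_accuracy', factor=0.1, patience=1, verbose=1)
# history = network.fit(features_train, target_train, epochs=20, batch_size=1000,
#                       validation_split=0.3, callbacks=[reduce_lr, early_stop], verbose=1)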

# EarlyStop = EarlyStopping(monitor='val_accuracy', patience=2, verbose=1, mode='auto')
# Reduce the learning rate when val_accuracy plateaus
Reduce = ReduceLROnPlateau(monitor='val_accuracy',
                         factor=0.1,
                         patience=1,
                         verbose=1,
                         mode='auto',
                         min_delta=0.0001,
                         cooldown=0,
                         min_lr=0)

# Train the network
history = network.fit(features_train,   # features
                      target_train,     # target vectors
                      epochs = 20,      # number of epochs
                      batch_size = 1000,# samples per batch
                      validation_split = 0.3,
                      # validation_data=(features_test,target_test)
                      callbacks=[Reduce],
                      verbose=1
                      )

loss, accuracy = network.evaluate(features_test, target_test)
print('loss:%.4f accuracy:%.4f' %(loss, accuracy))

epochs = len(history.history['loss'])
plt.plot(range(epochs), history.history['loss'], label='loss')
plt.plot(range(epochs), history.history['val_loss'], label='val_loss')
plt.legend(['Train', 'Validation'], loc='upper right')
plt.title('Loss Curve', fontsize=15)
plt.tick_params(axis='both', labelsize=14)
plt.xlabel('Epoch', fontsize=12)
plt.ylabel('Loss', fontsize=12)
plt.show()

print('Total time: {:.2f}s'.format(t.time()-t_begin))
Epoch 1/20
42/42 [==============================] - 36s 853ms/step - loss: 0.6979 - accuracy: 0.7809 - val_loss: 0.2200 - val_accuracy: 0.9364 - lr: 0.0010
Epoch 2/20
42/42 [==============================] - 46s 1s/step - loss: 0.2441 - accuracy: 0.9275 - val_loss: 0.1222 - val_accuracy: 0.9648 - lr: 0.0010
Epoch 3/20
42/42 [==============================] - 43s 1s/step - loss: 0.1574 - accuracy: 0.9543 - val_loss: 0.0926 - val_accuracy: 0.9719 - lr: 0.0010
Epoch 4/20
42/42 [==============================] - 49s 1s/step - loss: 0.1189 - accuracy: 0.9650 - val_loss: 0.0788 - val_accuracy: 0.9749 - lr: 0.0010
Epoch 5/20
42/42 [==============================] - 45s 1s/step - loss: 0.0991 - accuracy: 0.9706 - val_loss: 0.0652 - val_accuracy: 0.9807 - lr: 0.0010
Epoch 6/20
42/42 [==============================] - 49s 1s/step - loss: 0.0816 - accuracy: 0.9751 - val_loss: 0.0583 - val_accuracy: 0.9827 - lr: 0.0010
Epoch 7/20
42/42 [==============================] - 45s 1s/step - loss: 0.0742 - accuracy: 0.9771 - val_loss: 0.0567 - val_accuracy: 0.9829 - lr: 0.0010
Epoch 8/20
42/42 [==============================] - 39s 937ms/step - loss: 0.0676 - accuracy: 0.9807 - val_loss: 0.0522 - val_accuracy: 0.9843 - lr: 0.0010
Epoch 9/20
42/42 [==============================] - 38s 904ms/step - loss: 0.0611 - accuracy: 0.9811 - val_loss: 0.0483 - val_accuracy: 0.9854 - lr: 0.0010
Epoch 10/20
42/42 [==============================] - 36s 847ms/step - loss: 0.0563 - accuracy: 0.9827 - val_loss: 0.0479 - val_accuracy: 0.9860 - lr: 0.0010
Epoch 11/20
42/42 [==============================] - ETA: 0s - loss: 0.0509 - accuracy: 0.9840
Epoch 00011: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.
42/42 [==============================] - 37s 880ms/step - loss: 0.0509 - accuracy: 0.9840 - val_loss: 0.0470 - val_accuracy: 0.9860 - lr: 0.0010
Epoch 12/20
42/42 [==============================] - 37s 870ms/step - loss: 0.0441 - accuracy: 0.9860 - val_loss: 0.0454 - val_accuracy: 0.9865 - lr: 1.0000e-04
Epoch 13/20
42/42 [==============================] - 39s 930ms/step - loss: 0.0417 - accuracy: 0.9876 - val_loss: 0.0449 - val_accuracy: 0.9869 - lr: 1.0000e-04
Epoch 14/20
42/42 [==============================] - ETA: 0s - loss: 0.0405 - accuracy: 0.9878
Epoch 00014: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.
42/42 [==============================] - 42s 991ms/step - loss: 0.0405 - accuracy: 0.9878 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-04
Epoch 15/20
42/42 [==============================] - ETA: 0s - loss: 0.0398 - accuracy: 0.9876
Epoch 00015: ReduceLROnPlateau reducing learning rate to 1.0000000656873453e-06.
42/42 [==============================] - 38s 907ms/step - loss: 0.0398 - accuracy: 0.9876 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-05
Epoch 16/20
42/42 [==============================] - ETA: 0s - loss: 0.0408 - accuracy: 0.9873
Epoch 00016: ReduceLROnPlateau reducing learning rate to 1.0000001111620805e-07.
42/42 [==============================] - 36s 858ms/step - loss: 0.0408 - accuracy: 0.9873 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-06
Epoch 17/20
42/42 [==============================] - ETA: 0s - loss: 0.0407 - accuracy: 0.9873
Epoch 00017: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-08.
42/42 [==============================] - 36s 861ms/step - loss: 0.0407 - accuracy: 0.9873 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-07
Epoch 18/20
42/42 [==============================] - ETA: 0s - loss: 0.0399 - accuracy: 0.9873
Epoch 00018: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-09.
42/42 [==============================] - 36s 855ms/step - loss: 0.0399 - accuracy: 0.9873 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-08
Epoch 19/20
42/42 [==============================] - ETA: 0s - loss: 0.0412 - accuracy: 0.9880
Epoch 00019: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-10.
42/42 [==============================] - 37s 876ms/step - loss: 0.0412 - accuracy: 0.9880 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-09
Epoch 20/20
42/42 [==============================] - ETA: 0s - loss: 0.0394 - accuracy: 0.9881
Epoch 00020: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-11.
42/42 [==============================] - 36s 857ms/step - loss: 0.0394 - accuracy: 0.9881 - val_loss: 0.0447 - val_accuracy: 0.9868 - lr: 1.0000e-10
313/313 [==============================] - 3s 10ms/step - loss: 0.0323 - accuracy: 0.9896
loss:0.0323 accuracy:0.9896

[Figure: CNN training and validation loss per epoch with learning-rate reduction]

Total time: 831.15s

Plot the learning-rate curve from the run above, where the learning rate was reduced dynamically based on val_accuracy.

# lrate = len(history.history['lr'])
plt.plot(range(epochs), history.history['lr'])
plt.title('Learning Rate Curve', fontsize=15)
plt.tick_params(axis='both', labelsize=14)
plt.xlabel('Epoch', fontsize=12)
plt.ylabel('Learning Rate', fontsize=12)
plt.show()

[Figure: learning rate per epoch]

Comparison and summary

The three methods KNN, SVM, and CNN were applied to the MNIST handwritten-digit dataset as shown above, and the neural network clearly performs best on this task. The classical machine-learning methods nevertheless have their own strengths; a comparison of the three methods follows:

  • KNN has no training phase. Its principle is to find, in the training set, the k points closest to the sample to be predicted and take a majority vote over their labels as the prediction; classification is done purely by measuring distances between the query sample and the stored training data.

  • SVM separates the data with a hyperplane wx+b, so it has a training phase that finds w and b. Once trained, the model predicts the label of a sample x directly from the value of y = wx+b without consulting the training set again: a model is first fitted on the training set and then applied directly to classify the test set.

  • Since KNN has no training phase, every prediction requires computing the distance between the test sample and every training sample, which is inefficient when both sets are large. SVM pays a one-off training cost, obtains the hyperplane, and then labels new points directly from it, so prediction is much faster. Tuning also differs: KNN has essentially one hyperparameter, k, whereas SVM has more, such as the slack-variable penalty and the choice of kernel (and its parameters) in the common linearly non-separable case.

  • Unlike the classical methods, a CNN is a neural-network architecture commonly used for image tasks. It typically consists of an input layer, convolutional layers, pooling layers, and an output stage (fully connected layers plus a softmax layer). Extracting image features through these layers before classification reaches a very high accuracy here. Weight sharing in the convolutional layers reduces the number of parameters to train, and the same filter detects a pattern regardless of where it appears in the image, which strengthens the generalisation of the trained model. Pooling lowers the spatial resolution and suppresses small shifts and distortions, making the network tolerant of small translations of the input. As a deep model, however, it can suffer from vanishing gradients, and with much deeper architectures it may not outperform classical methods on every problem.
