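In this article, we first examine the training performance of a convolutional neural network (convnet) through its accuracy and loss curves. We then look inside the network by extracting and visualizing the weights and activations of its two hidden convolutional layers. The article covers:
- Visualizing the convnet training history
- Visualizing weights
- Visualizing activations
0. Import the required libraries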
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
# Let TensorFlow allocate GPU memory on demand instead of reserving it all up front
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
1. Load and preprocess the dataset
num_classes = 2
img_rows, img_cols = 28, 28
(x_train, y_train), (x_test, y_test) = mnist.load_data()
train_picks = np.logical_or(y_train == 2, y_train == 7)
test_picks = np.logical_or(y_test == 2, y_test == 7)
x_train = x_train[train_picks]
x_test = x_test[test_picks]
y_train = np.array(y_train[train_picks] == 7, dtype=int)
y_test = np.array(y_test[test_picks] == 7, dtype=int)
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# x_train shape: (12223, 28, 28, 1)
# 12223 train samples
# 2060 test samples
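The class filtering and one-hot encoding used above can be tried out without downloading MNIST. The following is a minimal sketch on a tiny synthetic array; the helper name `select_two_classes` and the demo data are illustrative assumptions, not part of the original code.

```python
import numpy as np

def select_two_classes(x, y, negative=2, positive=7, num_classes=2):
    """Keep only two digit classes and return one-hot binary labels."""
    picks = np.logical_or(y == negative, y == positive)
    x_sel = x[picks]
    # 0 marks the negative class (digit 2), 1 marks the positive class (digit 7)
    y_bin = (y[picks] == positive).astype(int)
    # Row i of the identity matrix is the one-hot vector for class i
    y_onehot = np.eye(num_classes, dtype="float32")[y_bin]
    return x_sel, y_onehot

# Synthetic stand-in for MNIST: six "images" of 2x2 pixels (hypothetical data)
x_demo = np.arange(6 * 2 * 2).reshape(6, 2, 2)
y_demo = np.array([2, 5, 7, 7, 0, 2])

x_sel, y_sel = select_two_classes(x_demo, y_demo)
print(x_sel.shape)  # (4, 2, 2)
print(y_sel)
```

Indexing an identity matrix is used here as a NumPy-only stand-in for `keras.utils.to_categorical`; for integer labels in the range 0 to `num_classes - 1` it should give the same one-hot layout.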
2. Build and train the neural network
For illustration purposes, I train on only 1,000 samples to save time.
model = Sequential()
model.add(Conv2D(32, [3, 3], padding='same', input_shape=input_shape))
model.add(Activation("relu"))
model.add(Conv2D(64, [3, 3], padding='same'))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
hist = model.fit(x_train[:1000], y_train[:1000], batch_size=64, epochs=10, verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=1)
print()
print('Test score:', score[0])
print('Test accuracy:', score[1])
# Test score: 0.06067337468266487
# Test accuracy: 0.9830096960067749
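To get a feel for the size of this network, the number of trainable parameters in each layer can be worked out by hand. The sketch below assumes the architecture exactly as defined above (28x28x1 input, 'same' padding, one 2x2 max-pooling step); the helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units

conv1 = conv2d_params(3, 3, 1, 32)       # first Conv2D layer
conv2 = conv2d_params(3, 3, 32, 64)      # second Conv2D layer
# 'same' padding keeps 28x28; 2x2 max pooling halves each side to 14x14
flat_units = (28 // 2) * (28 // 2) * 64
dense1 = dense_params(flat_units, 128)
dense2 = dense_params(128, 2)
total = conv1 + conv2 + dense1 + dense2

print(conv1, conv2, dense1, dense2)  # 320 18496 1605760 258
print(total)                         # 1624834
```

Almost all parameters sit in the first dense layer, which is one reason the two convolutional layers are small enough to inspect filter by filter. The totals can be compared against `model.summary()`.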
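3. Visualizing the training history
3.1 Visualizing model accuracy
Plotting accuracy against the number of epochs gives a clearer picture of how training is progressing and helps decide how many epochs are needed. Accuracy on both the training set and the test set rises as the number of epochs increases. There are some fluctuations toward the end (possibly because the training set is small), but accuracy begins to saturate, so 10 epochs should be enough for this model.
# List all keys in the history
print(hist.history.keys())
# dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
# Plot accuracy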
plt.plot(hist.history['accuracy'])
plt.plot(hist.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
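Besides reading the curve by eye, the epoch with the best validation accuracy can be looked up directly in the history dictionary. This is a small sketch on made-up numbers; `demo_history` is hypothetical and only mimics the structure of `hist.history`.

```python
def best_epoch(history, metric="val_accuracy"):
    """Return the 1-based epoch with the highest value of `metric`, and that value."""
    values = history[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]

# Hypothetical history with the same keys Keras records during fit()
demo_history = {
    "accuracy": [0.90, 0.95, 0.97, 0.98],
    "val_accuracy": [0.93, 0.96, 0.98, 0.975],
}

epoch, value = best_epoch(demo_history)
print(epoch, value)  # 3 0.98
```

With the real run, `best_epoch(hist.history)` would be the equivalent call, assuming the metric is recorded under the key `val_accuracy` as in the output shown above.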
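3.2 Visualizing model loss
It is also useful to look closely at how the loss changes across epochs to check for abnormal behavior. The loss should decrease on both the training set and the test set.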
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
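One kind of abnormal behavior worth checking for is the training loss continuing to fall while the validation loss rises, which usually hints at overfitting. The sketch below flags such epochs; the function name and the loss values are illustrative assumptions.

```python
def divergence_epochs(train_loss, val_loss):
    """Return 1-based epochs where training loss fell but validation loss rose."""
    flagged = []
    for i in range(1, min(len(train_loss), len(val_loss))):
        train_improved = train_loss[i] < train_loss[i - 1]
        val_worsened = val_loss[i] > val_loss[i - 1]
        if train_improved and val_worsened:
            flagged.append(i + 1)
    return flagged

# Hypothetical loss curves, not taken from the run above
demo_train = [0.60, 0.30, 0.20, 0.15, 0.10]
demo_val = [0.50, 0.28, 0.22, 0.25, 0.27]

print(divergence_epochs(demo_train, demo_val))  # [4, 5]
```

A single flagged epoch is often just noise on a small dataset; a run of consecutive flagged epochs is the more meaningful signal.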
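4. Visualizing weights
4.1 Visualizing the weights of the first convolutional layer
We first extract the weights of the first convolutional layer. This layer uses 3x3 kernels and 32 filters, and the input has a single channel, so there are 32 filters, each a 3x3 matrix. We then plot each filter as a 3x3-pixel image, where brighter pixels indicate larger weights.
# Get the kernel of the first convolutional layer, shape (3, 3, 1, 32),
# and drop the single input channel to obtain 32 filters of 3x3
W1 = model.layers[0].get_weights()[0][:,:,0,:]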
print(type(W1), W1.shape)
print()
print("Filter 1 in the first convolution layer : ")
print(W1[:,:,0])
# <class 'numpy.ndarray'> (3, 3, 32)
# Filter 1 in the first convolution layer :
# [[ 0.01086454 0.056543 0.14941093]
# [-0.14638866 -0.12262134 -0.14196308]
# [ 0.13865776 0.04602183 -0.05038093]]
plt.figure(1, figsize=(15,8))
for i in range(0,32):
    plt.subplot(4,10,i+1)
    plt.title('Filter ' + str(i+1))
    plt.imshow(W1[:,:,i],interpolation="nearest",cmap="gray")
plt.show()
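One caveat when reading these plots: without explicit `vmin` and `vmax`, `imshow` scales each image to its own minimum and maximum, so the same gray level can stand for different weight values in different subplots. A minimal sketch of rescaling all filters to one shared range follows; the helper name and the random stand-in weights are assumptions for illustration.

```python
import numpy as np

def to_shared_grayscale(filters):
    """Rescale a (h, w, n_filters) weight stack to [0, 1] using one global min/max."""
    lo, hi = float(filters.min()), float(filters.max())
    if hi == lo:
        # Constant input: nothing to scale, return all zeros
        return np.zeros_like(filters, dtype="float32")
    return ((filters - lo) / (hi - lo)).astype("float32")

# Random stand-in for W1 with the same shape (3, 3, 32)
rng = np.random.default_rng(0)
W_demo = rng.normal(scale=0.1, size=(3, 3, 32))

W_scaled = to_shared_grayscale(W_demo)
print(W_scaled.shape)
print(W_scaled.min(), W_scaled.max())
```

Passing the rescaled filters to `plt.imshow(..., cmap="gray", vmin=0, vmax=1)` should then make brightness comparable across all subplots.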
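4.2 Visualizing the weights of the second convolutional layer
In the same way, we extract and plot the weights of the second convolutional layer. This layer has 64 filters and 32 input channels (the output of the first convolutional layer), so there are 64 filters, each with a 3x3x32 weight tensor. We plot each filter as a 9x32-pixel image to get a 2D view, where brighter pixels indicate larger weights.
# Get the weights of the second convolutional layer and visualize the 64 filters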
W2 = model.layers[2].get_weights()[0]
W2 = W2.reshape(9,32,64)
print(type(W2), W2.shape)
print()
print("Filter 1 in the second convolution layer : ")
print(W2[:,:,0]) # (9,32)
# <class 'numpy.ndarray'> (9, 32, 64)
#
# Filter 1 in the second convolution layer :
# [[-0.01091811 -0.03060709 -0.03978376 -0.04080337 0.06473621 -0.05782736
# -0.0258076 0.01918847 -0.01420757 -0.04532713 -0.04676169 -0.00401624
# 0.04570166 -0.06421223 -0.02138959 0.00525405 -0.05001634 0.00086733
# 0.00674382 -0.01507682 0.0439934 -0.0566798 0.01208968 0.03013882
# 0.01585707 -0.03139614 -0.04916877 -0.003985 0.08380383 0.05887256
# 0.0579341 -0.00191117]
# [ 0.02193947 0.04070659 0.03027556 0.04897542 0.06259128 0.05289488
# -0.05446929 0.01715666 -0.0108433 0.01229278 -0.0538603 0.02770079
# -0.07649884 0.00221939 -0.03484974 0.09049256 -0.00947709 0.04822715
# 0.0642111 0.08096531 -0.0737703 0.01727819 0.05036381 0.04525996
# 0.00721971 -0.0017575 -0.05297728 -0.00443828 -0.04575218 0.04304853
# 0.0147502 0.05128239]
# [-0.01307715 0.00482483 -0.01247476 -0.04712921 0.00793466 0.02343135
# 0.04592272 -0.04308037 0.05638364 0.05473799 -0.047027 -0.08546065
# -0.0729535 -0.08467353 0.00503674 -0.03105897 0.00915634 0.00920174
# 0.05560324 -0.04362411 -0.06680793 0.043088 0.06740589 -0.00736665
# -0.03958222 0.04922237 0.03780904 -0.04567688 -0.03965345 -0.08939651
# -0.05760688 0.06449086]
# [ 0.06506912 0.00680344 -0.06401524 0.0806345 -0.01795834 -0.01566656
# -0.07478429 0.01886238 -0.04745758 -0.07382288 -0.04196274 0.02578226
# 0.07069031 0.03673461 -0.0367975 -0.03254769 -0.06679455 0.0033801
# 0.02889376 -0.05905522 0.02851398 -0.07045183 0.02615088 0.05586142
# 0.00186886 0.02112191 -0.03998655 0.00761757 -0.00883369 0.03426124
# 0.05703025 -0.06387127]
# [-0.02328891 0.08264955 -0.01879905 0.07405918 0.08385325 -0.00079525
# -0.0467049 -0.05124512 0.05491181 -0.03017477 0.02054235 0.041858
# 0.08942308 0.06560905 -0.03925848 -0.07331402 -0.06061706 0.00735451
# -0.00468104 -0.01752389 -0.03330135 0.04352342 -0.08125249 0.03029426
# 0.0566676 -0.05048853 -0.05054254 -0.01483525 0.09817632 -0.07307149
# 0.08647165 -0.0054682 ]
# [-0.03623843 0.02410195 -0.00040511 -0.05871908 0.03774738 -0.02915179
# -0.00641803 0.03069031 -0.03851721 0.06042026 -0.00113082 -0.02618534
# -0.0480439 -0.00372448 0.05602724 0.03090581 0.04881529 0.00957235
# -0.06921685 -0.0815805 0.06296474 0.06820709 -0.01987579 0.06897794
# -0.04817679 0.02222943 0.02991818 -0.01954529 0.02592529 -0.03616534
# -0.03498475 0.07600925]
# [ 0.07573617 0.02968863 -0.07077129 0.05345889 0.08573991 0.03679356
# 0.00131001 -0.08831469 -0.04378701 -0.053577 -0.01884281 -0.05064945
# 0.07720281 -0.02426615 -0.01522027 -0.02861699 -0.03779349 0.05025636
# -0.01406364 0.00057808 -0.07584406 0.08269251 0.07782793 0.07169417
# -0.05643699 0.0045524 0.04332888 -0.03659601 0.06403161 -0.01810314
# -0.00299348 0.05756217]
# [ 0.0636657 0.00866595 -0.01132362 0.0278302 0.06974484 0.03430951
# 0.01282684 -0.05145631 -0.01134195 0.05120582 -0.07760315 -0.0711544
# -0.04838275 -0.06641025 -0.0250096 0.01380729 -0.01927544 0.02004548
# 0.08112732 -0.04040825 0.06880209 -0.01973099 0.06739783 -0.07032029
# 0.01435619 0.07537328 -0.01236107 -0.01822355 0.01344493 0.06719773
# -0.00017048 0.06597216]
# [-0.05051003 -0.0024246 0.01049676 -0.06073973 0.08129401 -0.06479709
# 0.01883736 -0.07983018 -0.00760706 -0.00420313 -0.03179368 -0.0431091
# 0.07800876 -0.0925895 0.05004843 -0.06873198 -0.02518041 -0.01574085
# -0.07852727 -0.04268203 -0.04270604 0.04083467 0.0062125 -0.06441654
# -0.00192929 -0.02497609 0.02797592 -0.08972529 0.07900523 -0.00641907
# 0.08314665 0.03395193]]
plt.figure(1, figsize=(15,8))
for i in range(0,64):
    plt.subplot(7,10,i+1)
    plt.title('Filter ' + str(i+1))
    plt.imshow(W2[:,:,i],interpolation="nearest",cmap="gray")
plt.show()
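To read the 9x32 images correctly, it helps to know how the reshape maps kernel positions to image rows. With NumPy's default row-major order, row `k` of the reshaped array corresponds to kernel position `(k // 3, k % 3)`, and each column is one input channel. The sketch below checks this on a stand-in array with the same shape as the second-layer kernel.

```python
import numpy as np

# Stand-in for the second-layer kernel: (kernel_h, kernel_w, in_channels, filters)
W_demo = np.arange(3 * 3 * 32 * 64, dtype="float32").reshape(3, 3, 32, 64)
W_flat = W_demo.reshape(9, 32, 64)

# Pick an arbitrary weight and locate it in the flattened view
row, col, channel, filt = 1, 2, 5, 10
k = row * 3 + col  # the two spatial axes are merged row by row

print(k)  # 5
print(W_flat[k, channel, filt] == W_demo[row, col, channel, filt])  # True
```

So in each plotted image the nine rows run through the 3x3 kernel positions from top-left to bottom-right, and the 32 columns are the feature maps coming from the first layer.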
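5. Visualizing activations
Let's take a closer look at the activation output of each convolutional layer.
5.1 Visualizing the activations of the first convolutional layer
To extract the output of an intermediate layer in a deep network, we can simply build a new model that reuses the trained weights and stops right after the desired layer.
We will see what a test sample of a "7" looks like after passing through the 32 filters of the first convolutional layer.
# Extract the output of the first convolutional layer and plot the 32 feature maps
model2 = Sequential()
model2.add(Conv2D(32, [3, 3], padding='same', weights=model.layers[0].get_weights(), input_shape=input_shape))
model2.add(Activation("relu"))
x_rep = model2.predict(x_test[0:10]) # representations of the first ten samples
x_rep_1 = x_rep[0,:,:,:] # look at the first sample only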
print(x_rep_1.shape)
# (28, 28, 32)
plt.figure(1, figsize=(15,8))
for i in range(0,32):
    plt.subplot(4,10,i+1)
    plt.title('Filter ' + str(i+1))
    plt.imshow(x_rep_1[:,:,i],interpolation="nearest",cmap="gray")
plt.show()
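To make concrete what each of these feature maps contains, here is a plain NumPy sketch of what a convolutional layer with 'same' padding, stride 1, and ReLU computes for a single image. It is a slow reference loop for illustration only, not how Keras implements the layer, and the function name and demo inputs are assumptions.

```python
import numpy as np

def conv2d_same_relu(image, kernels, bias):
    """Naive conv + ReLU for one image.

    image:   (H, W, C_in)
    kernels: (kh, kw, C_in, C_out), odd kh and kw assumed
    bias:    (C_out,)
    """
    kh, kw, _, c_out = kernels.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad the borders so the output keeps the input's height and width
    padded = np.pad(image, ((ph, ph), (pw, pw), (0, 0)), mode="constant")
    h, w = image.shape[:2]
    out = np.zeros((h, w, c_out), dtype="float32")
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + kh, j:j + kw, :]  # (kh, kw, C_in)
            # Sum of element-wise products with every filter (no kernel flip)
            out[i, j, :] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2])) + bias
    return np.maximum(out, 0.0)  # ReLU

# Hypothetical input: one random 28x28 grayscale image and 32 filters of 3x3
rng = np.random.default_rng(1)
demo_image = rng.random((28, 28, 1)).astype("float32")
demo_kernels = np.zeros((3, 3, 1, 32), dtype="float32")
demo_kernels[1, 1, 0, 0] = 1.0  # filter 1 simply copies the center pixel

activations = conv2d_same_relu(demo_image, demo_kernels, np.zeros(32, dtype="float32"))
print(activations.shape)  # (28, 28, 32)
```

Each slice `activations[:, :, i]` plays the role of one subplot above: a bright pixel means the patch around that position matched filter `i` strongly, and ReLU sets negative responses to zero, which is why large regions appear black.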
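5.2 Visualizing the activations of the second convolutional layer
As the first-layer activations show, different filters produce bright pixels on different parts of the "7". This suggests that each filter learns to extract features from a particular part of the digit.
Finally, let's see what the "7" test sample looks like after passing through the 64 filters of the second convolutional layer.
# Extract the output of the second convolutional layer and plot the 64 feature maps
model3 = Sequential()
model3.add(Conv2D(32, [3, 3], padding='same', weights=model.layers[0].get_weights(), input_shape=input_shape))
model3.add(Activation("relu"))
model3.add(Conv2D(64, [3, 3], padding='same', weights=model.layers[2].get_weights()))
model3.add(Activation("relu"))
x_rep = model3.predict(x_test[0:10]) # representations of the first ten samples
x_rep_1 = x_rep[0,:,:,:] # look at the first sample only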
print(x_rep_1.shape)
# (28, 28, 64)
plt.figure(1, figsize=(15,8))
for i in range(0,64):
    plt.subplot(7,10,i+1)
    plt.title('Filter ' + str(i+1))
    plt.imshow(x_rep_1[:,:,i],interpolation="nearest",cmap="gray")
plt.show()
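With 64 feature maps it can be hard to tell at a glance which filters respond to the sample at all. A small numeric summary can complement the plot. The sketch below ranks filters by mean activation and lists those that stay at zero everywhere; the helper name and the toy array are assumptions for illustration.

```python
import numpy as np

def summarize_activations(feature_maps, top_k=5):
    """feature_maps: (H, W, n_filters) post-ReLU activations for one sample.

    Returns 0-based indices of the top_k filters by mean activation,
    and of the filters whose map is zero everywhere.
    """
    mean_per_filter = feature_maps.mean(axis=(0, 1))
    silent = np.flatnonzero(feature_maps.max(axis=(0, 1)) == 0)
    strongest = np.argsort(mean_per_filter)[::-1][:top_k]
    return strongest, silent

# Toy activations: 4x4 maps for 6 hypothetical filters
demo = np.zeros((4, 4, 6), dtype="float32")
demo[:, :, 1] = 0.5   # moderately active everywhere
demo[0, 0, 3] = 1.0   # a single bright pixel
demo[:, :, 4] = 2.0   # strongly active everywhere

strongest, silent = summarize_activations(demo, top_k=3)
print(strongest + 1)  # 1-based, matching the 'Filter N' plot titles
print(silent + 1)
```

Applied to the real data, `summarize_activations(x_rep_1)` would be the corresponding call. A filter that is silent for one sample is not necessarily useless, since it may respond to the other digit class.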
References
https://github.com/sukilau/Ziff-deep-learning/blob/master/1-MNIST/mnist-visualization.ipynb