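Header image from: toyota.csail.mit.edu
This article covers visual analysis of convolutional neural networks.
01 - Simple Linear Model | 02 - Convolutional Neural Network | 03 - PrettyTensor | 04 - Save & Restore
05 - Ensemble Learning | 06 - CIFAR-10 | 07 - Inception Model | 08 - Transfer Learning
09 - Video Data | 11 - Adversarial Examples | 12 - Adversarial Noise for MNIST
by Magnus Erik Hvass Pedersen / GitHub / Videos on YouTube
Chinese translation by thrillerist / GitHub
If you repost this article, please include a link to it.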
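Introduction
Some of the previous tutorials on convolutional neural networks, such as Tutorials #02 and #06, showed the convolutional filter weights. But from the filter weights alone it is impossible to determine what the convolutional filters recognize in the input image.
In this tutorial we present a basic method for visually analysing the inner workings of a neural network. The idea is to generate an image that maximizes an individual feature inside the network. The image is initialized with a little random noise and is then gradually changed using the gradient of the given feature with regard to the input image.
This method for visual analysis of a neural network is also known as feature maximization or activation maximization.
This tutorial builds on the previous ones. You should be roughly familiar with neural networks (see Tutorials #01 and #02), and knowing the Inception model is also helpful (Tutorial #07).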
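To make the idea concrete before moving on to the Inception model, here is a minimal NumPy sketch of activation maximization. It is not the tutorial's network: the "feature" is a single tanh neuron with hypothetical random weights `w`, chosen only because its gradient can be written by hand.

```python
import numpy as np

def feature(x, w):
    """Toy 'feature': a single tanh neuron applied to a flattened input."""
    return np.tanh(w @ x)

def feature_gradient(x, w):
    """Analytic gradient of the toy feature with regard to the input x."""
    return (1.0 - np.tanh(w @ x) ** 2) * w

rng = np.random.default_rng(0)
w = rng.normal(size=16)                  # hypothetical fixed weights
x = rng.uniform(-0.01, 0.01, size=16)    # "image" initialized with small noise

initial_value = feature(x, w)
for _ in range(20):
    # Gradient ascent: move the input in the direction that increases the feature.
    x += 0.1 * feature_gradient(x, w)
final_value = feature(x, w)

print("feature before:", initial_value)
print("feature after: ", final_value)
```

The loop has the same structure as the optimization used later in this tutorial: evaluate the gradient of one feature with regard to the input, then nudge the input along that gradient.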
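Flowchart
We use the Inception model from Tutorial #07. We want to find an image that maximizes a given feature inside the neural network. The input image is initialized with a little noise and is then updated using the gradient of the given feature. After a number of optimization iterations we get an image that this particular feature "likes to see".
Because the Inception model is constructed from many basic mathematical operations that are combined, TensorFlow can use the chain rule of differentiation to find the gradient of the loss function quickly.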
from IPython.display import Image, display
Image('images/13_visual_analysis_flowchart.png')
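The chain-rule remark can be checked by hand on a tiny example. The sketch below uses a made-up two-layer function (the weights `W1` and `w2` are hypothetical) and compares a gradient derived with the chain rule against a finite-difference estimate. TensorFlow automates the analytic part for the whole Inception graph; this only illustrates the principle.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 3))   # hypothetical first-layer weights
w2 = rng.normal(size=4)        # hypothetical second-layer weights
x = rng.normal(size=3)         # input vector

def forward(x):
    h = np.tanh(W1 @ x)        # hidden layer
    return w2 @ h              # scalar "feature" to maximize

def gradient_chain_rule(x):
    h = np.tanh(W1 @ x)
    # d(out)/dh = w2, dh/dz = 1 - h**2, dz/dx = W1
    return W1.T @ (w2 * (1.0 - h ** 2))

def gradient_numeric(x, eps=1e-6):
    # Central finite differences, one input coordinate at a time.
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (forward(x + step) - forward(x - step)) / (2 * eps)
    return grad

analytic = gradient_chain_rule(x)
numeric = gradient_numeric(x)
print("max difference:", np.max(np.abs(analytic - numeric)))
```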
Imports
%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
# Functions and classes for loading and using the Inception model.
import inception
This was developed using Python 3.5.2 (Anaconda) and TensorFlow version:
tf.__version__
'1.1.0'
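Inception Model
Download the Inception model from the internet
The Inception model is downloaded from the internet. This is the default directory where the data files are saved. The directory is created automatically if it does not exist.
# inception.data_dir = 'inception/'
Download the Inception model if it is not already present in the directory. It is 85 MB.
inception.maybe_download()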
Downloading Inception v3 Model ...
Download progress: 100.0%
Download finished. Extracting files.
Done.
Names of convolutional layers
This function returns a list of the names of the convolutional layers in the Inception model.
def get_conv_layer_names():
    # Load the Inception model.
    model = inception.Inception()
    # Create a list of names for the operations in the graph
    # for the Inception model where the operator-type is 'Conv2D'.
    names = [op.name for op in model.graph.get_operations() if op.type == 'Conv2D']
    # Close the TensorFlow session inside the model-object.
    model.close()
    return names
conv_names = get_conv_layer_names()
There are a total of 94 convolutional layers in the Inception model.
len(conv_names)
94
Print the names of the first 5 convolutional layers.
conv_names[:5]
['conv/Conv2D',
'conv_1/Conv2D',
'conv_2/Conv2D',
'conv_3/Conv2D',
'conv_4/Conv2D']
Print the names of the last 5 convolutional layers.
conv_names[-5:]
['mixed_10/tower_1/conv/Conv2D',
'mixed_10/tower_1/conv_1/Conv2D',
'mixed_10/tower_1/mixed/conv/Conv2D',
'mixed_10/tower_1/mixed/conv_1/Conv2D',
'mixed_10/tower_2/conv/Conv2D']
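Helper function for finding the input image
This function finds the input image that maximizes a given feature inside the network. It essentially performs optimization with gradient ascent. The image is initialized with small random values and is then updated step by step using the gradient of the given feature with regard to the input image.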
def optimize_image(conv_id=None, feature=0,
                   num_iterations=30, show_progress=True):
    """
    Find an image that maximizes the feature
    given by the conv_id and feature number.
    Parameters:
    conv_id: Integer identifying the convolutional layer to
             maximize. It is an index into conv_names.
             If None then use the last fully-connected layer
             before the softmax output.
    feature: Index into the layer for the feature to maximize.
    num_iterations: Number of optimization iterations to perform.
    show_progress: Boolean whether to show the progress.
    """
    # Load the Inception model. This is done for each call of
    # this function because we will add a lot to the graph
    # which will cause the graph to grow and eventually the
    # computer will run out of memory.
    model = inception.Inception()
    # Reference to the tensor that takes the raw input image.
    resized_image = model.resized_image
    # Reference to the tensor for the predicted classes.
    # This is the output of the final layer's softmax classifier.
    y_pred = model.y_pred
    # Create the loss-function that must be maximized.
    if conv_id is None:
        # If we want to maximize a feature on the last layer,
        # then we use the fully-connected layer prior to the
        # softmax-classifier. The feature no. is the class-number
        # and must be an integer between 1 and 1000.
        # The loss-function is just the value of that feature.
        loss = model.y_logits[0, feature]
    else:
        # If instead we want to maximize a feature of a
        # convolutional layer inside the neural network.
        # Get the name of the convolutional operator.
        conv_name = conv_names[conv_id]
        # Get a reference to the tensor that is output by the
        # operator. Note that ":0" is added to the name for this.
        tensor = model.graph.get_tensor_by_name(conv_name + ":0")
        # Set the Inception model's graph as the default
        # so we can add an operator to it.
        with model.graph.as_default():
            # The loss-function is the average of all the
            # tensor-values for the given feature. This
            # ensures that we generate the whole input image.
            # You can try and modify this so it only uses
            # a part of the tensor.
            loss = tf.reduce_mean(tensor[:, :, :, feature])
    # Get the gradient for the loss-function with regard to
    # the resized input image. This creates a mathematical
    # function for calculating the gradient.
    gradient = tf.gradients(loss, resized_image)
    # Create a TensorFlow session so we can run the graph.
    session = tf.Session(graph=model.graph)
    # Generate a random image of the same size as the raw input.
    # Each pixel is a small random value between 128 and 129,
    # which is about the middle of the colour-range.
    image_shape = resized_image.get_shape()
    image = np.random.uniform(size=image_shape) + 128.0
    # Perform a number of optimization iterations to find
    # the image that maximizes the loss-function.
    for i in range(num_iterations):
        # Create a feed-dict. This feeds the image to the
        # tensor in the graph that holds the resized image, because
        # this is the final stage for inputting raw image data.
        feed_dict = {model.tensor_name_resized_image: image}
        # Calculate the predicted class-scores,
        # as well as the gradient and the loss-value.
        pred, grad, loss_value = session.run([y_pred, gradient, loss],
                                             feed_dict=feed_dict)
        # Squeeze the dimensionality for the gradient-array.
        grad = np.array(grad).squeeze()
        # The gradient now tells us how much we need to change the
        # input image in order to maximize the given feature.
        # Calculate the step-size for updating the image.
        # This step-size was found to give fast convergence.
        # The addition of 1e-8 is to protect from div-by-zero.
        step_size = 1.0 / (grad.std() + 1e-8)
        # Update the image by adding the scaled gradient.
        # This is called gradient ascent.
        image += step_size * grad
        # Ensure all pixel-values in the image are between 0 and 255.
        image = np.clip(image, 0.0, 255.0)
        if show_progress:
            print("Iteration:", i)
            # Convert the predicted class-scores to a one-dim array.
            pred = np.squeeze(pred)
            # The predicted class for the Inception model.
            pred_cls = np.argmax(pred)
            # Name of the predicted class.
            cls_name = model.name_lookup.cls_to_name(pred_cls,
                                                     only_first_name=True)
            # The score (probability) for the predicted class.
            cls_score = pred[pred_cls]
            # Print the predicted score etc.
            msg = "Predicted class-name: {0} (#{1}), score: {2:>7.2%}"
            print(msg.format(cls_name, pred_cls, cls_score))
            # Print statistics for the gradient.
            msg = "Gradient min: {0:>9.6f}, max: {1:>9.6f}, stepsize: {2:>9.2f}"
            print(msg.format(grad.min(), grad.max(), step_size))
            # Print the loss-value.
            print("Loss:", loss_value)
            # Newline.
            print()
    # Close the TensorFlow session inside the model-object.
    model.close()
    return image.squeeze()
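The step-size rule in optimize_image, 1.0 / (grad.std() + 1e-8), is worth a closer look. The following NumPy sketch uses random stand-in gradients (not gradients from the Inception model) to suggest why it behaves well: whatever the overall magnitude of the gradient, the scaled update has a standard deviation close to 1, so each iteration changes pixels by roughly one intensity level.

```python
import numpy as np

rng = np.random.default_rng(2)
# Small stand-in image, initialized the same way as in optimize_image.
image = rng.uniform(size=(8, 8, 3)) + 128.0

update_stds = []
for scale in (1e-6, 1e-3, 1.0):
    # Stand-in gradient with an arbitrary overall magnitude.
    grad = scale * rng.normal(size=image.shape)
    step_size = 1.0 / (grad.std() + 1e-8)
    update = step_size * grad
    update_stds.append(update.std())
    print("gradient scale:", scale, "update std:", update.std())

# One update step followed by clipping to the valid pixel range.
updated = np.clip(image + update, 0.0, 255.0)
```

This is consistent with the progress logs in the Results section, where tiny gradients are paired with very large step sizes.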
Helper functions for plotting images and noise
This function normalizes an image so that its pixel values are between 0.0 and 1.0.
def normalize_image(x):
    # Get the min and max values for all pixels in the input.
    x_min = x.min()
    x_max = x.max()
    # Normalize so all values are between 0.0 and 1.0
    x_norm = (x - x_min) / (x_max - x_min)
    return x_norm
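Note that normalize_image divides by x_max - x_min, which is zero for a constant image. That should not happen with the optimized images in this tutorial, but if you reuse the helper elsewhere, a guarded variant along these lines may be useful. The name normalize_image_safe and the eps parameter are hypothetical and not part of the original notebook.

```python
import numpy as np

def normalize_image_safe(x, eps=1e-8):
    """Hypothetical variant of normalize_image that tolerates constant images."""
    x = np.asarray(x, dtype=np.float64)
    x_min = x.min()
    x_range = x.max() - x_min
    if x_range < eps:
        # A constant image has no contrast; return zeros instead of NaNs.
        return np.zeros_like(x)
    return (x - x_min) / x_range

print(normalize_image_safe(np.array([0.0, 127.5, 255.0])))
print(normalize_image_safe(np.full((2, 2), 128.0)))
```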
This function plots a single image.
def plot_image(image):
    # Normalize the image so pixels are between 0.0 and 1.0
    img_norm = normalize_image(image)
    # Plot the image.
    plt.imshow(img_norm, interpolation='nearest')
    plt.show()
This function plots 6 images in a grid.
def plot_images(images, show_size=100):
    """
    The show_size is the number of pixels to show for each image.
    The max value is 299.
    """
    # Create figure with sub-plots.
    fig, axes = plt.subplots(2, 3)
    # Adjust vertical and horizontal spacing.
    fig.subplots_adjust(hspace=0.1, wspace=0.1)
    # Use interpolation to smooth pixels?
    smooth = True
    # Interpolation type.
    if smooth:
        interpolation = 'spline16'
    else:
        interpolation = 'nearest'
    # For each entry in the grid.
    for i, ax in enumerate(axes.flat):
        # Get the i'th image and only use the desired pixels.
        img = images[i, 0:show_size, 0:show_size, :]
        # Normalize the image so its pixels are between 0.0 and 1.0
        img_norm = normalize_image(img)
        # Plot the image.
        ax.imshow(img_norm, interpolation=interpolation)
        # Remove ticks.
        ax.set_xticks([])
        ax.set_yticks([])
    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()
Helper function for optimizing and plotting images
This function optimizes multiple images and plots them.
def optimize_images(conv_id=None, num_iterations=30, show_size=100):
    """
    Find 6 images that maximize the 6 first features in the layer
    given by the conv_id.
    Parameters:
    conv_id: Integer identifying the convolutional layer to
             maximize. It is an index into conv_names.
             If None then use the last layer before the softmax output.
    num_iterations: Number of optimization iterations to perform.
    show_size: Number of pixels to show for each image. Max 299.
    """
    # Which layer are we using?
    if conv_id is None:
        print("Final fully-connected layer before softmax.")
    else:
        print("Layer:", conv_names[conv_id])
    # Initialize the list of images.
    images = []
    # For each feature do the following. Note that the
    # last fully-connected layer only supports numbers
    # between 1 and 1000, while the convolutional layers
    # support numbers between 0 and some other number.
    # So we just use the numbers from 1 to 6.
    for feature in range(1, 7):
        print("Optimizing image for feature no.", feature)
        # Find the image that maximizes the given feature
        # for the network layer identified by conv_id (or None).
        image = optimize_image(conv_id=conv_id, feature=feature,
                               show_progress=False,
                               num_iterations=num_iterations)
        # Squeeze the dim of the array.
        image = image.squeeze()
        # Append to the list of images.
        images.append(image)
    # Convert to numpy-array so we can index all dimensions easily.
    images = np.array(images)
    # Plot the images.
    plot_images(images=images, show_size=show_size)
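Results
Optimize an image for a shallow convolutional layer
As an example, find the input image that maximizes feature no. 2 of the convolutional layer conv_names[conv_id], where conv_id=5.
image = optimize_image(conv_id=5, feature=2,
                       num_iterations=30, show_progress=True)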
Iteration: 0
Predicted class-name: dishwasher (#667), score: 4.81%
Gradient min: -0.000083, max: 0.000100, stepsize: 76290.32
Loss: 4.83793
Iteration: 1
Predicted class-name: kite (#397), score: 15.12%
Gradient min: -0.000142, max: 0.000126, stepsize: 71463.42
Loss: 5.59611
Iteration: 2
Predicted class-name: wall clock (#524), score: 6.85%
Gradient min: -0.000119, max: 0.000121, stepsize: 80427.39
Loss: 6.91725
...
Iteration: 28
Predicted class-name: bib (#941), score: 19.26%
Gradient min: -0.000043, max: 0.000043, stepsize: 214742.82
Loss: 17.7469
Iteration: 29
Predicted class-name: bib (#941), score: 18.87%
Gradient min: -0.000047, max: 0.000059, stepsize: 218511.00
Loss: 17.9321
plot_image(image)
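Optimize multiple images for convolutional layers
Below we optimize multiple images for convolutional layers in the Inception model and plot them. These images show what the convolutional layers "like to see". Note how the patterns become increasingly complex in the deeper layers.
optimize_images(conv_id=0, num_iterations=10)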
Layer: conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
optimize_images(conv_id=3, num_iterations=30)
Layer: conv_3/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=4, num_iterations=30)
Layer: conv_4/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=5, num_iterations=30)
Layer: mixed/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=6, num_iterations=30)
Layer: mixed/tower/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=7, num_iterations=30)
Layer: mixed/tower/conv_1/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=8, num_iterations=30)
Layer: mixed/tower_1/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=9, num_iterations=30)
Layer: mixed/tower_1/conv_1/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=10, num_iterations=30)
Layer: mixed/tower_1/conv_2/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=20, num_iterations=30)
Layer: mixed_2/tower/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=30, num_iterations=30)
Layer: mixed_4/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=40, num_iterations=30)
Layer: mixed_5/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=50, num_iterations=30)
Layer: mixed_6/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=60, num_iterations=30)
Layer: mixed_7/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=70, num_iterations=30)
Layer: mixed_8/tower/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=80, num_iterations=30)
Layer: mixed_9/tower_1/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=90, num_iterations=30)
Layer: mixed_10/tower_1/conv_1/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
optimize_images(conv_id=93, num_iterations=30)
Layer: mixed_10/tower_2/conv/Conv2D
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
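Final fully-connected layer before softmax
Now we optimize and plot images for the final layer of the Inception model. This is the fully-connected layer just before the softmax classifier. The features in this layer correspond to the output classes.
We might have hoped to see recognizable patterns in these images, such as monkeys or birds corresponding to the output classes, but the images only show complex, abstract patterns.
optimize_images(conv_id=None, num_iterations=30)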
Final fully-connected layer before softmax.
Optimizing image for feature no. 1
Optimizing image for feature no. 2
Optimizing image for feature no. 3
Optimizing image for feature no. 4
Optimizing image for feature no. 5
Optimizing image for feature no. 6
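The images above only show 100x100 pixels, but they are actually 299x299 pixels. If we perform more optimization iterations and plot the full image, some recognizable patterns might appear. So let us optimize the first image again and plot it in full resolution.
The Inception model classifies the resulting image as "kit fox" with about 100% confidence, but to the human eye the image only shows abstract patterns.
If you want to test another feature number, note that it must be between 1 and 1000, because it must correspond to a valid class number for the final output layer.
image = optimize_image(conv_id=None, feature=1,
                       num_iterations=100, show_progress=True)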
Iteration: 0
Predicted class-name: dishwasher (#667), score: 4.98%
Gradient min: -0.006252, max: 0.004451, stepsize: 3734.48
Loss: -0.837608
Iteration: 1
Predicted class-name: ballpoint (#907), score: 8.52%
Gradient min: -0.007303, max: 0.006427, stepsize: 2152.89
Loss: -0.416723
...
Iteration: 98
Predicted class-name: kit fox (#1), score: 100.00%
Gradient min: -0.007732, max: 0.010692, stepsize: 1286.44
Loss: 67.5603
Iteration: 99
Predicted class-name: kit fox (#1), score: 100.00%
Gradient min: -0.005850, max: 0.006159, stepsize: 1863.65
Loss: 75.6356
plot_image(image=image)
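The log above reports a loss (the logit for class 1) of about 75.6 together with a score of 100.00%. A minimal softmax sketch shows why such a logit saturates the probability. The assumption that there are 1000 classes whose other logits are all zero is made purely for illustration; the real logits of the Inception model are not shown in the log.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

# Assumption for illustration only: 1000 classes with logit zero,
# except class 1, which gets the final loss value from the log above.
logits = np.zeros(1000)
logits[1] = 75.6356

probs = softmax(logits)
print("{0:.2%}".format(probs[1]))
```

Because the softmax is exponential in the logits, a gap of this size leaves practically no probability mass for the other classes, whatever the image looks like to a human.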
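Close the TensorFlow session
The TensorFlow session has already been closed inside the functions above that use the Inception model. This is done to save memory, so that the computer does not crash when many gradient functions are added to the computational graph.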
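Conclusion
This tutorial showed how to optimize input images so as to maximize features inside a neural network. Because a given feature (or neuron) inside the network responds most strongly to particular images, this lets us visually analyse what it "likes to see".
For the lower layers of the neural network, the images contain simple patterns such as different kinds of wavy lines. As we go deeper into the network, the image patterns become increasingly complex. We might have hoped that the patterns for the deep layers would be recognizable, such as monkeys, foxes or cars, but in practice the image patterns for the deep layers just become more complex and abstract.
Why is that? Recall from Tutorial #11 that the Inception model is easily fooled by a little adversarial noise into classifying any input image as another target class. So it is not hard to imagine that the Inception model recognizes abstract image patterns that are unclear to the human eye. There may be infinitely many images that maximize the features inside a neural network, and humans can only recognize a small fraction of them. This may be why the optimization process only finds abstract image patterns.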
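Other methods
The research literature contains many proposals for guiding the optimization process so that it finds image patterns that are more recognizable to humans.
One paper proposes a combination of heuristics for guiding the optimization of the image patterns. The paper shows sample images for several classes, such as flamingo, pelican and black swan, that are all more or less recognizable to the human eye. An implementation of the method is available here (the exact line numbers may change in the future). The method requires a combination of heuristics and fine-tuning of their parameters to generate these images, but the choice of parameters is not made clear in the paper. Despite some experimentation, I could not reproduce their results. Perhaps I have misunderstood the paper, or perhaps the heuristics were fine-tuned to their network architecture (a variant of AlexNet), whereas this tutorial uses the more advanced Inception model.
Another paper proposes a different method for generating images that are recognizable to the human eye. However, the method effectively cheats, because it goes through all the images in the training set (e.g. ImageNet) to find those that maximally activate a given feature inside the neural network. It then clusters and averages similar images, and uses the result as the initial image for the optimization procedure. So it is no wonder that the method gives better results when it starts from an image constructed from real photos.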
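The initialization idea described in the previous paragraph can be sketched in a few lines of NumPy. This is only an illustration under strong assumptions, not the paper's implementation: the "dataset" is random noise, the feature is a hypothetical linear detector with weights `weights`, and the clustering step is omitted, so the 10 most strongly activating images are simply averaged.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-ins: a tiny random "dataset" and a hypothetical linear feature detector.
dataset = rng.uniform(0.0, 255.0, size=(200, 8, 8, 3))
weights = rng.normal(size=(8, 8, 3))

def activation(images):
    """Activation of the hypothetical feature for a batch of images."""
    return np.tensordot(images, weights, axes=([1, 2, 3], [0, 1, 2]))

scores = activation(dataset)
top_k = np.argsort(scores)[-10:]             # indices of the 10 strongest activations
initial_image = dataset[top_k].mean(axis=0)  # average them (clustering omitted)

print("initial image shape:", initial_image.shape)
print("activation of initial image:", activation(initial_image[np.newaxis])[0])
print("mean activation over dataset:", scores.mean())
```

An image built this way could replace the random noise as the starting point of the optimization, which is why the resulting images inherit structure from real photos.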
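Exercises
Below are a few suggested exercises that may help improve your skills with TensorFlow. Hands-on experience is important for learning how to use TensorFlow properly.
You may want to back up this Notebook before making any changes to it.
- Try running the optimization several times for features in the lower layers of the network. Are the resulting images always the same?
- Try using fewer or more optimization iterations. How does this affect the image quality?
- Try changing the loss function for the convolutional features. This can be done in different ways. How does it affect the image patterns? Why?
- Do you think the optimizer also increases other features besides the one we want to maximize? How would you measure this? Are you sure the optimizer only maximizes one feature at a time?
- Try maximizing multiple features simultaneously.
- Train a smaller network on the MNIST data set and try to visualize its features and layers. Is it easier to see patterns in the images?
- Try implementing the methods from the papers mentioned above.
- Try your own ideas for improving the optimized images.
- Explain to a friend how the program works.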
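As a possible starting point for the exercises on changing the loss function and maximizing several features at once, here is a NumPy sketch of three loss variants on a stand-in tensor. The function names are hypothetical, and the random array only mimics the [batch, height, width, channels] layout of a convolutional output. Porting the ideas into the TensorFlow graph inside optimize_image is left as part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(4)
# Stand-in for a convolutional output of shape [batch, height, width, channels].
tensor = rng.normal(size=(1, 6, 6, 4))

def loss_whole_channel(t, feature):
    """Mean over the whole feature map, as in optimize_image."""
    return t[:, :, :, feature].mean()

def loss_center_patch(t, feature, size=2):
    """Mean over a central size-by-size patch of the feature map only."""
    h, w = t.shape[1], t.shape[2]
    top, left = (h - size) // 2, (w - size) // 2
    return t[:, top:top + size, left:left + size, feature].mean()

def loss_multiple_features(t, features):
    """Mean over several feature maps at once."""
    return t[:, :, :, features].mean()

print(loss_whole_channel(tensor, 2))
print(loss_center_patch(tensor, 2))
print(loss_multiple_features(tensor, [0, 1]))
```

Using only a patch of the feature map would be expected to concentrate the generated pattern in one region of the input image, while the multi-feature loss mixes the patterns of the selected features.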