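Introduction
Class activation map (CAM) visualization helps us understand which parts of an image led a convolutional neural network to its final classification decision. It computes a score for every location in the input image, indicating how important that location is for the class in question.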
1. Loading the VGG16 network weights
from keras.applications.vgg16 import VGG16
model = VGG16(weights='./VGG16_fc.h5') # includes the final fully connected layers
2. Preprocessing an input image for the VGG16 model
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input,decode_predictions
import numpy as np
img_path = './image/elephant.jpeg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('predicted: ', decode_predictions(preds, top=3)[0])
print(np.argmax(preds[0]))
The output is:
predicted: [('n02504458', 'African_elephant', 0.914756), ('n01871265', 'tusker', 0.08050947), ('n02504013', 'Indian_elephant', 0.004628111)]
386
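Predicted probability of African elephant: 91.48%
Tusker: 8.05%
Indian elephant: 0.46%
The most strongly activated entry of the prediction vector is the one for the African elephant class, at index 386.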
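The index 386 and the top-3 list are both read from the same 1000-element probability vector. As a rough illustration, the NumPy sketch below reproduces the two lookups on a small made-up vector. The five-entry `probs` array and the `labels` list are hypothetical stand-ins for the real model output, with values chosen only to resemble the result above.

```python
import numpy as np

# Hypothetical 5-class probability vector standing in for preds[0]
# (the real VGG16 output has 1000 entries).
labels = ["cat", "African_elephant", "tusker", "Indian_elephant", "dog"]
probs = np.array([0.0001, 0.9148, 0.0805, 0.0046, 0.0000])

# np.argmax returns the index of the most strongly activated entry,
# which is what print(np.argmax(preds[0])) reports for the real model.
best_index = int(np.argmax(probs))

# Sorting the indices by descending probability gives the top-k classes,
# similar in spirit to decode_predictions(preds, top=3).
order = np.argsort(probs)[::-1][:3]
top3 = [(labels[i], float(probs[i])) for i in order]

print(best_index)
print(top3)
```

In the real run the same argmax lands on index 386, which is why that index is used to select the class output in the next section.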
3. Applying the Grad-CAM algorithm
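from keras import backend as K
import matplotlib.pyplot as plt

# K.gradients only works in graph mode (not with TensorFlow 2 eager execution)
# "African elephant" entry of the prediction vector
elephant_output = model.output[:, 386]
# block5_conv3 is the last convolutional layer in VGG16
last_conv_layer = model.get_layer('block5_conv3')
# Gradient of the "African elephant" class with respect to that layer's output
grads = K.gradients(elephant_output, last_conv_layer.output)[0]
# Vector of shape (512,): mean gradient of each feature-map channel
pooled_grads = K.mean(grads, axis=(0, 1, 2))
# Fetch both quantities for the sample image
iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])
pooled_grads_value, conv_layer_output_value = iterate([x])
# Weight each channel by its importance for the "elephant" class
for i in range(512):
    conv_layer_output_value[:, :, i] *= pooled_grads_value[i]
# The channel-wise mean of the result is the class activation heatmap
heatmap = np.mean(conv_layer_output_value, axis=-1)
# Post-process the heatmap: drop negative values and normalize to [0, 1]
heatmap = np.maximum(heatmap, 0)
heatmap /= np.max(heatmap)
plt.matshow(heatmap)
plt.show()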
The output image is:
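To make the arithmetic of the channel-weighting loop easier to follow, here is a minimal NumPy-only sketch of the same computation on random data. The arrays `feature_maps` and `grads` are made-up stand-ins for `last_conv_layer.output[0]` and the gradient tensor (14 x 14 x 512 for a 224 x 224 input to VGG16, with the batch axis already removed), and `grad_cam_heatmap` is a hypothetical helper name, not a Keras function.

```python
import numpy as np

def grad_cam_heatmap(feature_maps, grads):
    """Combine feature maps (H, W, C) and gradients (H, W, C) into a heatmap (H, W)."""
    # Average each channel's gradient over the spatial axes: one weight per channel.
    pooled_grads = grads.mean(axis=(0, 1))        # shape (C,)
    # Broadcasting scales every channel by its weight (replaces the explicit loop).
    weighted = feature_maps * pooled_grads        # shape (H, W, C)
    # The channel-wise mean gives the raw class activation map.
    heatmap = weighted.mean(axis=-1)              # shape (H, W)
    # Keep only positive evidence, then scale to [0, 1].
    heatmap = np.maximum(heatmap, 0)
    peak = heatmap.max()
    if peak > 0:                                  # assumption: guard against an all-zero map
        heatmap = heatmap / peak
    return heatmap

rng = np.random.default_rng(0)
feature_maps = rng.random((14, 14, 512))          # stand-in activations
grads = rng.normal(size=(14, 14, 512))            # stand-in gradients
heatmap = grad_cam_heatmap(feature_maps, grads)
print(heatmap.shape, heatmap.min(), heatmap.max())
```

The zero-peak guard is an addition of this sketch; the code in this section divides by the maximum unconditionally, which is fine as long as at least one heatmap value is positive.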
4. Superimposing the heatmap on the original image
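import cv2

# Load the original image with OpenCV (BGR channel order)
img = cv2.imread(img_path)
# Resize the heatmap to the size of the original image
heatmap = cv2.resize(heatmap, (img.shape[1], img.shape[0]))
# Convert the heatmap to 8-bit and apply the JET colormap
heatmap = np.uint8(255*heatmap)
heatmap = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)
# 0.4 is the heatmap intensity factor
superimposed_img = heatmap * 0.4 + img
cv2.imwrite('./image/elephant_heapmap.jpeg', superimposed_img)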
The output image is:
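The blend `heatmap * 0.4 + img` produces floating-point values that can exceed 255. The sketch below illustrates the same overlay idea with NumPy only, under simplifying assumptions: nearest-neighbour upsampling with `np.kron` in place of `cv2.resize`, a hypothetical red-only colouring in place of `cv2.COLORMAP_JET`, and synthetic arrays in place of the elephant photo. `overlay_heatmap` is an illustrative helper name.

```python
import numpy as np

def overlay_heatmap(img, heatmap, alpha=0.4):
    """Blend a [0, 1] heatmap (h, w) onto a uint8 BGR image (H, W, 3).

    Assumes H and W are integer multiples of h and w.
    """
    scale_y = img.shape[0] // heatmap.shape[0]
    scale_x = img.shape[1] // heatmap.shape[1]
    # Nearest-neighbour upsampling: each cell becomes a scale_y x scale_x block.
    big = np.kron(heatmap, np.ones((scale_y, scale_x)))
    # Hypothetical colouring: put the heat into the red channel (index 2 in BGR).
    colored = np.zeros(img.shape, dtype=np.float64)
    colored[..., 2] = 255.0 * big
    # Same weighted sum as in the text, then clip and cast to a valid 8-bit image.
    blended = colored * alpha + img.astype(np.float64)
    return np.clip(blended, 0, 255).astype(np.uint8)

img = np.full((224, 224, 3), 200, dtype=np.uint8)  # synthetic grey image
heatmap = np.zeros((14, 14))
heatmap[3, 5] = 1.0                                 # a single hot cell
result = overlay_heatmap(img, heatmap)
print(result.shape, result.dtype, result.max())
```

Clipping before the cast keeps bright regions from wrapping around or depending on how the image writer handles out-of-range floats.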
5. Conclusion
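The heatmap shows that the ears of the elephant calf are strongly activated. This may be the feature the network uses to tell African elephants from Indian elephants.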
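One way to check an observation like this is to look up where the heatmap peaks and map that cell back to pixel coordinates. The following is only a sketch under assumptions: it uses a synthetic 14 x 14 heatmap rather than the real one, and `peak_region` is a hypothetical helper. For a 224 x 224 input, each cell of the `block5_conv3` map lines up with a 16 x 16 pixel block (ignoring the wider receptive field of the layer).

```python
import numpy as np

def peak_region(heatmap, image_size=(224, 224)):
    """Return (top, left, bottom, right) pixel bounds of the hottest heatmap cell."""
    # Position of the maximum in the 2D heatmap.
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Size in pixels of the image block that one heatmap cell covers.
    cell_h = image_size[0] // heatmap.shape[0]
    cell_w = image_size[1] // heatmap.shape[1]
    return (int(row * cell_h), int(col * cell_w),
            int((row + 1) * cell_h), int((col + 1) * cell_w))

heatmap = np.zeros((14, 14))
heatmap[2, 9] = 1.0                 # synthetic peak
print(peak_region(heatmap))         # (32, 144, 48, 160)
```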
References
- François Chollet, trans. Zhang Liang. Python深度学习 (Deep Learning with Python) [M]. Posts & Telecom Press, 2018.8, pp. 142-145.