Notes
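CNN-F is the same network as VGG-F. The original paper is [1], and its project page is [2] (the paper says the code is available there).
[3] says it uses CNN-F, and the pretrained parameter file it offers for download is imagenet-vgg-f.mat;
[5] says it uses both CNN-F and VGG-19, and cites [1] for CNN-F. Judging from the code that initializes the pretrained parameters, it was probably adapted from [1]'s code;
the Caffe Model Zoo [10] simply names the files VGG_CNN_F.
The CNN-F used in [7] is the Caffe one, and its code modifies Caffe's own source. I have not got it running on GPU yet, although it works on CPU. Reading the ImageNet-pretrained CNN-F parameters into tensorflow helps get away from Caffe...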
Visualization
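[9] is an online tool for visualizing Caffe models. Opening a .prototxt file with it shows the network structure as defined in Caffe, which is useful as a reference.
Concretely, find the .prototxt used in [8] and open it with [9].
For tensorflow, just use tensorboard.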
Code
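This follows [4] and [6]. The original code relies on some layers the authors wrote themselves (the commented-out calls); here they are replaced with tf's built-in layers. (They can be switched back later if problems come up.)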
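One of the built-in replacements is tf.nn.local_response_normalization. As a reference for what that op computes, here is a small NumPy sketch of the formula given in the TensorFlow documentation, using the same parameter values as the code in this section. The function name `lrn` is hypothetical, and the sketch is meant for checking values on small arrays, not as a drop-in layer.

```python
import numpy as np

def lrn(x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75):
    """NumPy reference for local response normalization over the last axis.

    For each channel d:
        sqr_sum = sum(x[..., d - depth_radius : d + depth_radius + 1] ** 2)
        out     = x / (bias + alpha * sqr_sum) ** beta
    The window is clipped at the channel boundaries.
    """
    x = np.asarray(x, dtype=np.float64)
    sq = x ** 2
    channels = x.shape[-1]
    sqr_sum = np.zeros_like(x)
    for c in range(channels):
        lo = max(0, c - depth_radius)
        hi = min(channels, c + depth_radius + 1)
        sqr_sum[..., c] = sq[..., lo:hi].sum(axis=-1)
    return x / (bias + alpha * sqr_sum) ** beta

# Tiny NHWC example: one pixel with five channels, all ones.
example = lrn(np.ones((1, 1, 1, 5)))
```

With alpha this small and bias=2, the layer mostly rescales activations by roughly 2 ** -0.75; the neighbouring channels only matter when activations are large.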
import os
import numpy as np
import scipy.io as sio
import tensorflow as tf
# print(tf.__version__)  # 1.12.0

P = "G:/dataset"
VGG_CNN_F = "imagenet-vgg-f.mat"
layers = ('conv1', 'relu1', 'norm1', 'pool1',
          'conv2', 'relu2', 'norm2', 'pool2',
          'conv3', 'relu3', 'conv4', 'relu4',
          'conv5', 'relu5', 'pool5',
          'fc6', 'relu6', 'fc7', 'relu7')

cnnf = sio.loadmat(os.path.join(P, VGG_CNN_F))
weights = cnnf["layers"][0]
# print(weights.shape)

# Input
in_img = tf.placeholder(tf.float32, shape=[None, 224, 224, 3], name="in_image")

# Load the CNN-F parameters
cnn_f = {}
current = in_img
for i, name in enumerate(layers):
    if name.startswith('conv'):
        kernels, bias = weights[i][0][0][0][0]
        bias = bias.reshape(-1)
        pad = weights[i][0][0][1]
        stride = weights[i][0][0][4]
        # current = conv_layer(current, kernels, bias, pad, stride, i, labnet)
        # Note: with "same", tf picks the padding itself; it is not
        # guaranteed to equal the pad values stored in the file.
        current = tf.layers.conv2d(current, kernels.shape[-1], kernels.shape[0],
                                   strides=stride[0],
                                   padding="valid" if pad[0][0] == 0 else "same",
                                   kernel_initializer=tf.initializers.constant(kernels),
                                   bias_initializer=tf.initializers.constant(bias),
                                   name=name)
    elif name.startswith('relu'):
        current = tf.nn.relu(current)
    elif name.startswith('pool'):
        stride = weights[i][0][0][1]
        pad = weights[i][0][0][2]
        area = weights[i][0][0][5]
        # current = pool_layer(current, stride, pad, area)
        current = tf.layers.max_pooling2d(current, area[0], stride[0],
                                          padding="valid" if np.sum(pad) == 0 else "same",
                                          name=name)
    elif name.startswith('fc'):
        # The fc layers are stored as convolution kernels, so apply them as convolutions
        kernels, bias = weights[i][0][0][0][0]
        bias = bias.reshape(-1)
        # current = full_conv(current, kernels, bias, i, labnet)
        current = tf.layers.conv2d(current, kernels.shape[-1], kernels.shape[0],
                                   # strides=(1, 1),
                                   padding="valid",
                                   kernel_initializer=tf.initializers.constant(kernels),
                                   bias_initializer=tf.initializers.constant(bias),
                                   name=name)
    elif name.startswith('norm'):
        current = tf.nn.local_response_normalization(current,
                                                     depth_radius=2,
                                                     bias=2.000,
                                                     alpha=0.0001,
                                                     beta=0.75)
    cnn_f[name] = current

# Write the graph for tensorboard
with tf.Session() as sess:
    writer = tf.summary.FileWriter("G:/TensorBoard/VGG-CCN-F", sess.graph)
    writer.close()  # flush the event file to disk
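The valid/same choice in the loop can be sanity-checked with plain arithmetic. The sketch below traces the spatial size of a 224x224 input through the layers twice: once with explicit padding followed by a valid window, and once with the valid/same rule used in the code. The kernel sizes, strides and pad amounts in `ASSUMED_LAYERS` are assumptions (conv settings as described in [1], pooling taken to be 3x3 with stride 2 and one pixel of padding at the bottom/right); they are not read from imagenet-vgg-f.mat, so compare them with the pad and stride fields in your own file.

```python
import math

def explicit_out(n, k, s, pad_before=0, pad_after=0):
    """Output size for explicit padding followed by a valid window (floor rounding)."""
    return (n + pad_before + pad_after - k) // s + 1

def same_out(n, s):
    """Output size of padding="same" in TensorFlow: ceil(n / stride)."""
    return math.ceil(n / s)

# (name, kernel, stride, (pad_before, pad_after)); all values are assumptions.
ASSUMED_LAYERS = [
    ("conv1", 11, 4, (0, 0)),
    ("pool1", 3, 2, (0, 1)),
    ("conv2", 5, 1, (2, 2)),
    ("pool2", 3, 2, (0, 1)),
    ("conv3", 3, 1, (1, 1)),
    ("conv4", 3, 1, (1, 1)),
    ("conv5", 3, 1, (1, 1)),
    ("pool5", 3, 2, (0, 1)),
    ("fc6", 6, 1, (0, 0)),
    ("fc7", 1, 1, (0, 0)),
]

def trace(size, layers=ASSUMED_LAYERS):
    """Return (name, size with explicit padding, size with the valid/same rule) per layer."""
    rows = []
    explicit = same = size
    for name, k, s, (pb, pa) in layers:
        explicit = explicit_out(explicit, k, s, pb, pa)
        if pb + pa == 0:
            same = explicit_out(same, k, s)   # padding="valid"
        else:
            same = same_out(same, s)          # padding="same"
        rows.append((name, explicit, same))
    return rows

for row in trace(224):
    print(row)
```

Under these assumed settings the two conventions agree up to conv2 and diverge at pool2 (13 vs 14), which would leave fc6 with a 2x2 output instead of 1x1. If tensorboard shows such a shape, padding explicitly with tf.pad and then using padding="valid" is one way to reproduce the stored pad values.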
Comparison
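Comparing the graph structure shown in tensorboard with the structure in [8], the only difference is that the dropout layers following the last two fc layers are missing.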
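For context on what the missing layers do: dropout is only active during training and acts as the identity at inference time, so when the network is used as a fixed feature extractor its absence should not change the outputs. Below is a minimal NumPy sketch of inverted dropout, the scheme tf.layers.dropout uses (kept units are scaled by 1 / (1 - rate)). The function name and the rate of 0.5 are assumptions for illustration, not values taken from [8].

```python
import numpy as np

def dropout(x, rate=0.5, training=False, rng=None):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest.

    When training is False the input is returned unchanged.
    """
    if not training or rate == 0.0:
        return x
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(x.shape) >= rate           # True for units that survive
    return np.where(keep, x / (1.0 - rate), 0.0)

features = np.ones((4, 4096), dtype=np.float32)  # stand-in for fc6/fc7 activations
at_inference = dropout(features)                 # identical to the input
at_training = dropout(features, training=True, rng=np.random.default_rng(0))
```

If the model is fine-tuned rather than only used for inference, adding dropout back after relu6 and relu7 would bring the graph closer to the structure in [8].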
Files
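According to issue #4 of DCMH, the new download link provided by the author is vgg-net.zip, but this file cannot be used with [3]'s code directly and needs some modifications. For how the file is parsed, see iTomxy/test.CNN-F/test.cnnf.ipynb (or test.cnnf.ipynb).
The files are shared on Baidu Netdisk: link: cnnf-vggf, extraction code: 6b1m.
iTomxy/test.CNN-F is a usage example.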
References
- [1] BMVC 2014 | Return of the Devil in the Details: Delving Deep into Convolutional Nets
- [2] Blog for [1]
- [3] CVPR 2017 | Deep Cross-Modal Hashing
- [4] Code for [3]
- [5] CVPR 2018 | Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
- [6] Code for [5]
- [7] TCSVT 2017 | SSDH: Semi-Supervised Deep Hashing for Large Scale Image Retrieval
- [8] Code for [7]
- [9] Netron
- [10] Caffe Model Zoo