The input to pairwise similarity learning can be written as (x, x_p, sim), while the input to pointwise feature learning is (x, label).
The following walks through how to train a Siamese network, based on the official Caffe tutorial.
Preparing the Training Set
Caffe reads its input from a database in which each entry is one training sample, so the input pair (x, x_p) must first be stored in a single Datum, one image per channel.
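As a minimal sketch (in NumPy, not Caffe's actual C++ API), storing a pair in one two-channel blob simply means stacking the two images along the channel axis:

```python
import numpy as np

rows, cols = 28, 28
x = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)    # first image
x_p = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)  # second image

# Channel 0 holds x, channel 1 holds x_p -- one sample, two channels.
pair = np.stack([x, x_p], axis=0)
print(pair.shape)  # (2, 28, 28)
```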
First, look at the Datum message defined in caffe.proto:
message Datum {
  optional int32 channels = 1;
  optional int32 height = 2;
  optional int32 width = 3;
  // the actual image data, in bytes
  optional bytes data = 4;
  optional int32 label = 5;
  // Optionally, the datum could also hold float data.
  repeated float float_data = 6;
  // If true data contains an encoded image that need to be decoded
  optional bool encoded = 7 [default = false];
}
An input with sim = 1 marks the pair as similar; sim = 0 marks it as dissimilar.
In convert_mnist_siamese_data.cpp:
caffe::Datum datum;
datum.set_channels(2);  // one channel for each image in the pair
datum.set_height(rows);
datum.set_width(cols);
for (int itemid = 0; itemid < num_items; ++itemid) {
  int i = caffe::caffe_rng_rand() % num_items;  // pick a random pair
  int j = caffe::caffe_rng_rand() % num_items;
  read_image(&image_file, &label_file, i, rows, cols,
      pixels, &label_i);
  read_image(&image_file, &label_file, j, rows, cols,
      pixels + (rows * cols), &label_j);
  datum.set_data(pixels, 2*rows*cols);
  if (label_i == label_j) {
    datum.set_label(1);  // similar pair
  } else {
    datum.set_label(0);  // dissimilar pair
  }
  datum.SerializeToString(&value);
  std::string key_str = caffe::format_int(itemid, 8);
  db->Put(leveldb::WriteOptions(), key_str, value);
}
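The pairing logic of the loop above can be sketched in Python. This is a hedged re-implementation, not Caffe's code; the images and labels below are synthetic stand-ins for the MNIST files:

```python
import numpy as np

rng = np.random.default_rng(0)
num_items, rows, cols = 100, 28, 28
images = rng.integers(0, 256, (num_items, rows, cols), dtype=np.uint8)
labels = rng.integers(0, 10, num_items)

pairs, sims = [], []
for _ in range(num_items):
    i = int(rng.integers(num_items))  # pick a random pair, as in the C++ loop
    j = int(rng.integers(num_items))
    # Concatenate the two images into one 2*rows*cols byte buffer,
    # mirroring datum.set_data(pixels, 2*rows*cols).
    pairs.append(np.concatenate([images[i].ravel(), images[j].ravel()]))
    sims.append(1 if labels[i] == labels[j] else 0)  # 1 = similar, 0 = dissimilar

pairs = np.stack(pairs)  # shape: (num_items, 2*rows*cols)
```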
Network Architecture
The Siamese network differs from an ordinary network in three main ways:
- After the Data layer there is a Slice layer which, as the name suggests, splits the pair_data read by the Data layer into data and data_p.
- The loss layer is replaced by ContrastiveLoss (see Raia Hadsell, Sumit Chopra, and Yann LeCun, "Dimensionality Reduction by Learning an Invariant Mapping"), which pulls similar training samples close together in the feature space.
- The parameters of the weight layers (convolutional and fully connected layers) are given names, so that the two branches share weights.
…
param { name: "conv1_w" … }
param { name: "conv1_b" … }
…
param { name: "conv2_w" … }
param { name: "conv2_b" … }
…
param { name: "ip1_w" … }
param { name: "ip1_b" … }
…
param { name: "ip2_w" … }
param { name: "ip2_b" … }
…
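For reference, the contrastive loss from Hadsell et al. can be sketched in NumPy. The margin default of 1.0 and the 1/2 factor follow Caffe's ContrastiveLoss layer, but treat this as an illustration of the formula rather than the exact layer implementation:

```python
import numpy as np

def contrastive_loss(feat_a, feat_b, sim, margin=1.0):
    """Mean contrastive loss over a batch of feature pairs.

    sim = 1 pulls a pair together (penalty d^2);
    sim = 0 pushes it apart until d >= margin.
    """
    d = np.linalg.norm(feat_a - feat_b, axis=1)  # Euclidean distance per pair
    loss = sim * d**2 + (1 - sim) * np.maximum(margin - d, 0)**2
    return 0.5 * loss.mean()

a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.3, 0.4], [0.3, 0.4]])  # both pairs at distance 0.5
sim = np.array([1, 0])
print(contrastive_loss(a, b, sim))  # 0.125
```

The similar pair contributes d^2 = 0.25; the dissimilar pair, still inside the margin, contributes (1 - 0.5)^2 = 0.25; halving the mean gives 0.125.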
The visualized network topology is shown below:
Visualizing the Test Results
The following is taken from mnist_siamese.ipynb.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Make sure that caffe is on the python path:
caffe_root = '../../'  # this file is expected to be in {caffe_root}/examples/siamese
import sys
sys.path.insert(0, caffe_root + 'python')
import caffe

MODEL_FILE = 'mnist_siamese.prototxt'
# decrease if you want to preview during training
PRETRAINED_FILE = 'mnist_siamese_iter_50000.caffemodel'
caffe.set_mode_cpu()
net = caffe.Net(MODEL_FILE, PRETRAINED_FILE, caffe.TEST)

TEST_DATA_FILE = '../../data/mnist/t10k-images-idx3-ubyte'
TEST_LABEL_FILE = '../../data/mnist/t10k-labels-idx1-ubyte'
n = 10000

with open(TEST_DATA_FILE, 'rb') as f:
    f.read(16)  # skip the header
    raw_data = np.frombuffer(f.read(n * 28*28), dtype=np.uint8)

with open(TEST_LABEL_FILE, 'rb') as f:
    f.read(8)  # skip the header
    labels = np.frombuffer(f.read(n), dtype=np.uint8)

# reshape and preprocess
caffe_in = raw_data.reshape(n, 1, 28, 28) * 0.00390625  # manually scale data instead of using `caffe.io.Transformer`
out = net.forward_all(data=caffe_in)
feat = out['feat']

f = plt.figure(figsize=(16, 9))
c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
     '#ff00ff', '#990000', '#999900', '#009900', '#009999']
for i in range(10):
    plt.plot(feat[labels == i, 0].flatten(), feat[labels == i, 1].flatten(), '.', c=c[i])
plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])
plt.grid()
plt.show()
The test network used here is shown below:
The final output is a 2-D vector, i.e. each sample is mapped to a 2-D coordinate (x, y). Because of the ContrastiveLoss layer, similar samples end up close together in the embedded space, producing a clustering effect. In the resulting plot, each color corresponds to one class.
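To make the "similar samples end up close together" claim concrete, here is a hypothetical check on synthetic 2-D features; the feat array below stands in for the network's output and is not real model output:

```python
import numpy as np

# Synthetic 2-D "features": two tight clusters standing in for two digit classes.
rng = np.random.default_rng(1)
feat = np.concatenate([rng.normal([0, 0], 0.1, (50, 2)),
                       rng.normal([5, 5], 0.1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)

def mean_dist(a, b):
    # Mean pairwise Euclidean distance between two point sets.
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).mean()

within = mean_dist(feat[labels == 0], feat[labels == 0])
across = mean_dist(feat[labels == 0], feat[labels == 1])
print(within < across)  # True: same-class points sit much closer together
```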