The input to pairwise similarity learning can be written as (x, x_p, sim), while the input to pointwise feature learning is (x, label).
The following walks through how to train a Siamese network, based on the official Caffe tutorial.
Preparing the Training Set
Caffe reads its input from a database in which each entry is one training sample, so the input pair (x, x_p) must first be stored in a single Datum, one image per channel.
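As a minimal sketch (in NumPy, not Caffe's actual C++ API), storing a pair in one two-channel blob simply means stacking the two images along the channel axis:

```python
import numpy as np

rows, cols = 28, 28
x = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)    # first image
x_p = np.random.randint(0, 256, (rows, cols), dtype=np.uint8)  # second image

# Channel 0 holds x, channel 1 holds x_p -- one sample, two channels.
pair = np.stack([x, x_p], axis=0)
print(pair.shape)  # (2, 28, 28)
```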
First, look at the Datum message defined in caffe.proto:
message Datum {
  optional int32 channels = 1;
  optional int32 height = 2;
  optional int32 width = 3;
  // the actual image data, in bytes
  optional bytes data = 4;
  optional int32 label = 5;
  // Optionally, the datum could also hold float data.
  repeated float float_data = 6;
  // If true data contains an encoded image that need to be decoded
  optional bool encoded = 7 [default = false];
}
An input with sim = 1 marks the pair as similar; sim = 0 marks it as dissimilar.
In convert_mnist_siamese_data.cpp:
caffe::Datum datum;
datum.set_channels(2);  // one channel for each image in the pair
datum.set_height(rows);
datum.set_width(cols);
for (int itemid = 0; itemid < num_items; ++itemid) {
  int i = caffe::caffe_rng_rand() % num_items;  // pick a random pair
  int j = caffe::caffe_rng_rand() % num_items;
  read_image(&image_file, &label_file, i, rows, cols,
      pixels, &label_i);
  read_image(&image_file, &label_file, j, rows, cols,
      pixels + (rows * cols), &label_j);
  datum.set_data(pixels, 2*rows*cols);
  if (label_i == label_j) {
    datum.set_label(1);  // similar pair
  } else {
    datum.set_label(0);  // dissimilar pair
  }
  datum.SerializeToString(&value);
  std::string key_str = caffe::format_int(itemid, 8);
  db->Put(leveldb::WriteOptions(), key_str, value);
}
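The pairing logic of the loop above can be sketched in Python. This is a hedged re-implementation, not Caffe's code; the images and labels below are synthetic stand-ins for the MNIST files:

```python
import numpy as np

rng = np.random.default_rng(0)
num_items, rows, cols = 100, 28, 28
images = rng.integers(0, 256, (num_items, rows, cols), dtype=np.uint8)
labels = rng.integers(0, 10, num_items)

pairs, sims = [], []
for _ in range(num_items):
    i = int(rng.integers(num_items))  # pick a random pair, as in the C++ loop
    j = int(rng.integers(num_items))
    # Concatenate the two images into one 2*rows*cols byte buffer,
    # mirroring datum.set_data(pixels, 2*rows*cols).
    pairs.append(np.concatenate([images[i].ravel(), images[j].ravel()]))
    sims.append(1 if labels[i] == labels[j] else 0)  # 1 = similar, 0 = dissimilar

pairs = np.stack(pairs)  # shape: (num_items, 2*rows*cols)
```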
Network Architecture
The Siamese network differs from an ordinary network in three main ways:
- After the Data layer there is a Slice layer which, as the name suggests, splits the pair_data read by the Data layer into data and data_p.
- The loss layer is replaced by ContrastiveLoss (see Raia Hadsell, Sumit Chopra, and Yann LeCun, "Dimensionality Reduction by Learning an Invariant Mapping"), which pulls similar training samples close together in the feature space.
- The parameters of the weight layers (convolutional and fully connected layers) are given names, so that the two branches share weights.
…
param { name: "conv1_w" … }
param { name: "conv1_b" … }
…
param { name: "conv2_w" … }
param { name: "conv2_b" … }
…
param { name: "ip1_w" … }
param { name: "ip1_b" … }
…
param { name: "ip2_w" … }
param { name: "ip2_b" … }
…
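For reference, the contrastive loss from Hadsell et al. can be sketched in NumPy. The margin default of 1.0 and the 1/2 factor follow Caffe's ContrastiveLoss layer, but treat this as an illustration of the formula rather than the exact layer implementation:

```python
import numpy as np

def contrastive_loss(feat_a, feat_b, sim, margin=1.0):
    """Mean contrastive loss over a batch of feature pairs.

    sim = 1 pulls a pair together (penalty d^2);
    sim = 0 pushes it apart until d >= margin.
    """
    d = np.linalg.norm(feat_a - feat_b, axis=1)  # Euclidean distance per pair
    loss = sim * d**2 + (1 - sim) * np.maximum(margin - d, 0)**2
    return 0.5 * loss.mean()

a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.3, 0.4], [0.3, 0.4]])  # both pairs at distance 0.5
sim = np.array([1, 0])
print(contrastive_loss(a, b, sim))  # 0.125
```

The similar pair contributes d^2 = 0.25; the dissimilar pair, still inside the margin, contributes (1 - 0.5)^2 = 0.25; halving the mean gives 0.125.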
The visualized network topology is shown below:
Visualizing the Test Results
The following is taken from mnist_siamese.ipynb.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Make sure that caffe is on the python path:
caffe_root = '../../'  # this file is expected to be in {caffe_root}/examples/siamese
import sys
sys.path.insert(0, caffe_root + 'python')
import caffe

MODEL_FILE = 'mnist_siamese.prototxt'
# decrease if you want to preview during training
PRETRAINED_FILE = 'mnist_siamese_iter_50000.caffemodel'
caffe.set_mode_cpu()
net = caffe.Net(MODEL_FILE, PRETRAINED_FILE, caffe.TEST)

TEST_DATA_FILE = '../../data/mnist/t10k-images-idx3-ubyte'
TEST_LABEL_FILE = '../../data/mnist/t10k-labels-idx1-ubyte'
n = 10000

with open(TEST_DATA_FILE, 'rb') as f:
    f.read(16)  # skip the header
    raw_data = np.frombuffer(f.read(n * 28*28), dtype=np.uint8)

with open(TEST_LABEL_FILE, 'rb') as f:
    f.read(8)  # skip the header
    labels = np.frombuffer(f.read(n), dtype=np.uint8)

# reshape and preprocess
caffe_in = raw_data.reshape(n, 1, 28, 28) * 0.00390625  # manually scale data instead of using `caffe.io.Transformer`
out = net.forward_all(data=caffe_in)
feat = out['feat']

f = plt.figure(figsize=(16, 9))
c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
     '#ff00ff', '#990000', '#999900', '#009900', '#009999']
for i in range(10):
    plt.plot(feat[labels == i, 0].flatten(), feat[labels == i, 1].flatten(), '.', c=c[i])
plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])
plt.grid()
plt.show()
The test network used here is shown below:
The final output is a 2-D vector, i.e. each sample is mapped to a 2-D coordinate (x, y). Because of the ContrastiveLoss layer, similar samples end up close together in the embedded space, producing a clustering effect. In the resulting plot, each color corresponds to one class.
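To make the "similar samples end up close together" claim concrete, here is a hypothetical check on synthetic 2-D features; the feat array below stands in for the network's output and is not real model output:

```python
import numpy as np

# Synthetic 2-D "features": two tight clusters standing in for two digit classes.
rng = np.random.default_rng(1)
feat = np.concatenate([rng.normal([0, 0], 0.1, (50, 2)),
                       rng.normal([5, 5], 0.1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)

def mean_dist(a, b):
    # Mean pairwise Euclidean distance between two point sets.
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).mean()

within = mean_dist(feat[labels == 0], feat[labels == 0])
across = mean_dist(feat[labels == 0], feat[labels == 1])
print(within < across)  # True: same-class points sit much closer together
```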