Siamese network embedding
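This example uses Caffe to extract features from a trained Siamese network and plot them. The notebook was only lightly modified so that it runs on the local machine, with explanatory notes added (originally in Chinese); the original file comes from the Notebook Examples on the Caffe website: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/siamese/mnist_siamese.ipynb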
---Last update: June 7, 2015
Setup
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# Change the working directory to caffe-master
%cd '/home/ouxinyu/caffe-master'
# Make sure that caffe is on the python path:
caffe_root = './' # the working directory is already the Caffe root after the %cd above
import sys
sys.path.insert(0, caffe_root + 'python')
import caffe
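Download the training data and train the model
This example requires a trained Siamese network, so complete the training first. For details, see the Caffe documentation: http://caffe.berkeleyvision.org/gathered/examples/siamese.html
# If the MNIST data has not been downloaded yet, download it first; the scripts can be run from a shell (e.g. over ssh) or directly in the IPython notebook
# An internet connection is required for the download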
# ~/caffe-master$ data/mnist/get_mnist.sh
! data/mnist/get_mnist.sh
# ~/caffe-master$ examples/siamese/create_mnist_siamese.sh
! examples/siamese/create_mnist_siamese.sh
# ~/caffe-master$ examples/siamese/train_mnist_siamese.sh
# Trains for 50000 iterations by default, which takes roughly 5-10 minutes
! examples/siamese/train_mnist_siamese.sh
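For orientation, the Caffe Siamese example trains two weight-sharing branches with a contrastive loss, which pulls pairs with the same label together and pushes different-label pairs at least a margin apart. The NumPy sketch below only illustrates that idea and is not Caffe's implementation; the function name `contrastive_loss`, its argument names, and the default margin of 1.0 are assumptions made for this sketch.

```python
import numpy as np

def contrastive_loss(feat_a, feat_b, same, margin=1.0):
    """Mean contrastive loss over a batch of embedding pairs (illustrative).

    feat_a, feat_b: arrays of shape (N, D), the outputs of the two branches.
    same: array of shape (N,), 1 if the pair shares a label, otherwise 0.
    margin: hypothetical default; different-label pairs farther apart
            than this contribute nothing to the loss.
    """
    feat_a = np.asarray(feat_a, dtype=np.float64)
    feat_b = np.asarray(feat_b, dtype=np.float64)
    same = np.asarray(same, dtype=np.float64)
    # Euclidean distance between the two embeddings of each pair
    dist = np.sqrt(np.sum((feat_a - feat_b) ** 2, axis=1))
    # Same-label pairs are penalised by their squared distance
    pull = same * dist ** 2
    # Different-label pairs are penalised only while closer than the margin
    push = (1.0 - same) * np.maximum(margin - dist, 0.0) ** 2
    return np.sum(pull + push) / (2.0 * len(same))

# Two pairs at distance 0.5: the first shares a label, the second does not
a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.0, 0.5], [0.0, 0.5]])
loss = contrastive_loss(a, b, same=[1, 0])
```

With this formulation a well-trained embedding places each digit class in its own region of the feature space, which is what the scatter plot at the end of the notebook is meant to show.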
Load the trained net
MODEL_FILE = 'examples/siamese/mnist_siamese.prototxt'
# decrease if you want to preview during training
PRETRAINED_FILE = 'examples/siamese/mnist_siamese_iter_50000.caffemodel'
caffe.set_mode_cpu()
net = caffe.Net(MODEL_FILE, PRETRAINED_FILE, caffe.TEST)
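To preview the embedding while training is still running, the snapshot name has to be lowered by hand to an iteration that already exists. As a minimal sketch, the hypothetical helper `latest_snapshot` below picks the highest-numbered snapshot on disk; it assumes snapshots follow the `<prefix>_iter_<N>.caffemodel` naming seen in `PRETRAINED_FILE` above.

```python
import glob
import re

def latest_snapshot(prefix):
    """Return the snapshot path with the highest iteration number, or None.

    prefix: snapshot prefix such as 'examples/siamese/mnist_siamese'
            (assumed naming: <prefix>_iter_<N>.caffemodel).
    """
    pattern = re.compile(r'_iter_(\d+)\.caffemodel$')
    best_iter, best_path = -1, None
    for path in glob.glob(prefix + '_iter_*.caffemodel'):
        match = pattern.search(path)
        # Compare iteration counts numerically, not as strings
        if match and int(match.group(1)) > best_iter:
            best_iter, best_path = int(match.group(1)), path
    return best_path

# Possible usage in this notebook (returns None if nothing was saved yet):
# PRETRAINED_FILE = latest_snapshot('examples/siamese/mnist_siamese')
```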
Load MNIST test data
TEST_DATA_FILE = 'data/mnist/t10k-images-idx3-ubyte'
TEST_LABEL_FILE = 'data/mnist/t10k-labels-idx1-ubyte'
n = 10000
with open(TEST_DATA_FILE, 'rb') as f:
    f.read(16) # skip the 16-byte header
    raw_data = np.frombuffer(f.read(n * 28*28), dtype=np.uint8)
with open(TEST_LABEL_FILE, 'rb') as f:
    f.read(8) # skip the 8-byte header
    labels = np.frombuffer(f.read(n), dtype=np.uint8)
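The cell above skips the file headers and hard-codes the image count and size. As an alternative sketch, the headers can be parsed instead: the MNIST IDX image file starts with four big-endian 32-bit integers (magic number 2051, image count, rows, columns) and the label file with two (magic number 2049, label count). The function names `read_idx_images` and `read_idx_labels` are hypothetical helpers, not part of Caffe or of the original notebook.

```python
import struct
import numpy as np

def read_idx_images(path):
    """Read an IDX3 image file, validating the header instead of skipping it."""
    with open(path, 'rb') as f:
        magic, count, rows, cols = struct.unpack('>IIII', f.read(16))
        if magic != 2051:
            raise ValueError('not an IDX3 image file (magic=%d)' % magic)
        data = np.frombuffer(f.read(count * rows * cols), dtype=np.uint8)
    # reshape fails loudly if the file is truncated
    return data.reshape(count, rows, cols)

def read_idx_labels(path):
    """Read an IDX1 label file, validating the header instead of skipping it."""
    with open(path, 'rb') as f:
        magic, count = struct.unpack('>II', f.read(8))
        if magic != 2049:
            raise ValueError('not an IDX1 label file (magic=%d)' % magic)
        labels = np.frombuffer(f.read(count), dtype=np.uint8)
    if labels.size != count:
        raise ValueError('label file is truncated')
    return labels

# Possible usage in this notebook:
# images = read_idx_images('data/mnist/t10k-images-idx3-ubyte')
# labels = read_idx_labels('data/mnist/t10k-labels-idx1-ubyte')
```

Reading the sizes from the header means a wrong or partially downloaded file raises an error instead of silently producing misaligned data.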
Generate the Siamese features
# reshape and preprocess
caffe_in = raw_data.reshape(n, 1, 28, 28) * 0.00390625 # manually scale data instead of using `caffe.io.Transformer`
out = net.forward_all(data=caffe_in)
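The constant 0.00390625 used above is exactly 1/256, so the 8-bit pixel values 0..255 are mapped into the range [0, 1). The sketch below isolates that preprocessing step so it can be checked without Caffe; the helper name `preprocess` and the size check are assumptions for illustration, and unlike the cell above it casts to float32 (the cell above yields float64).

```python
import numpy as np

SCALE = 0.00390625  # exactly 1/256

def preprocess(raw, n, height=28, width=28):
    """Reshape flat uint8 pixels to (n, 1, height, width) floats in [0, 1)."""
    raw = np.asarray(raw, dtype=np.uint8)
    if raw.size != n * height * width:
        raise ValueError('expected %d pixels, got %d'
                         % (n * height * width, raw.size))
    # One channel per image, in the (batch, channel, height, width) layout
    return raw.reshape(n, 1, height, width).astype(np.float32) * SCALE

# Two synthetic "images" covering the full 0..255 value range
demo_pixels = (np.arange(2 * 28 * 28) % 256).astype(np.uint8)
demo = preprocess(demo_pixels, 2)
```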
Visualize the learned Siamese embedding
feat = out['feat']
f = plt.figure(figsize=(16,9))
c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
     '#ff00ff', '#990000', '#999900', '#009900', '#009999']
for i in range(10):
    plt.plot(feat[labels==i,0].flatten(), feat[labels==i,1].flatten(), '.', c=c[i])
plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])
plt.grid()
plt.show()
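The scatter plot gives a visual impression of how well the digit classes separate. One possible way to attach a number to it is a nearest-centroid check on the embedding, sketched below. The function name `nearest_centroid_accuracy` is hypothetical, the demo data is synthetic, and since the centroids are computed from the same points that are scored, the result is an optimistic estimate.

```python
import numpy as np

def nearest_centroid_accuracy(feat, labels):
    """Fraction of points whose nearest class centroid matches their label.

    feat: array of shape (N, D), e.g. the 2-D Siamese features.
    labels: array of shape (N,) with the class of each point.
    """
    feat = np.asarray(feat, dtype=np.float64)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Mean embedding of each class
    centroids = np.stack([feat[labels == c].mean(axis=0) for c in classes])
    # Squared distance from every point to every centroid, shape (N, C)
    d2 = ((feat[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predicted = classes[d2.argmin(axis=1)]
    return float((predicted == labels).mean())

# Synthetic demo: two well-separated clusters
demo_feat = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
demo_labels = np.array([0, 0, 1, 1])
demo_acc = nearest_centroid_accuracy(demo_feat, demo_labels)

# Possible usage in this notebook:
# print(nearest_centroid_accuracy(feat, labels))
```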