**Abstract:** Triplet loss can improve the performance of feature matching and is used in object recognition, face recognition, retrieval, and similar tasks. This article implements triplet loss in MATLAB.
Triplet loss learns a mapping function $f(x)$ that takes a feature $x$ into an embedding space by minimizing the objective

$$\min_{W}\ \frac{1}{m}\sum_{i=1}^{m}\left[\,\|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha\,\right]$$
where $f(x) = Wx$, $x_i^a$ is the anchor, $x_i^p$ is a positive sample belonging to the same class as $x_i^a$, and $x_i^n$ is a negative sample belonging to a different class from $x_i^a$.
This objective can then be minimized directly by unconstrained optimization.
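For reference, the gradient that the implementation below computes can be derived by stacking the anchors, positives, and negatives column-wise into matrices $A$, $P$, $N$ and using $\partial\|Wv\|_2^2/\partial W = 2Wvv^\top$ (a sketch for the plain loss above, without a hinge $\max(0,\cdot)$, which matches the code):

$$J(W) = \frac{1}{m}\sum_{i=1}^{m}\left[\|W(x_i^a - x_i^p)\|_2^2 - \|W(x_i^a - x_i^n)\|_2^2 + \alpha\right]$$

$$\frac{\partial J}{\partial W} = \frac{2}{m}\,W\left[(A-P)(A-P)^\top - (A-N)(A-N)^\top\right]$$

This matrix form is exactly what the gradient line in tripletCost.m below evaluates.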
function demo_tripletloss
clear all
clc
% Toy data: each column is one 600-d feature vector, 300 triplets in total.
data{1} = rand(600,300); % anchor samples x^a
data{2} = rand(600,300); % positive samples x^p (same class as the anchor)
data{3} = rand(600,300); % negative samples x^n (different class)
inputSize = 600;         % input feature dimension
hiddenSize = 400;        % embedding dimension
theta = initializeParameters(hiddenSize, inputSize);
addpath minFunc/
options.Method = 'cg'; % Here, we use conjugate gradient to optimize our cost
                       % function. Generally, for minFunc to work, you
                       % need a function pointer with two outputs: the
                       % function value and the gradient. In our problem,
                       % tripletCost.m satisfies this.
options.maxIter = 400; % Maximum number of iterations to run
options.display = 'on';
[opttheta, cost] = minFunc(@(p) tripletCost(p, inputSize, hiddenSize, data), theta, options);
tripletCost.m
function [cost,grad] = tripletCost(theta, inputSize, hiddenSize, data)
%================================================
% He Lingxiao, Institute of Automation, Chinese Academy of Sciences
% Created: May 25, 2016
%================================================
% data: training samples for triplet loss (data{1}: anchors,
%       data{2}: positives, data{3}: negatives; one sample per column)
% W:    transformation matrix
% cost: cost function value
% grad: gradient
W = reshape(theta, hiddenSize, inputSize);
% Cost and gradient variables, initialized to zeros.
cost = 0;
Wgrad = zeros(size(W));
% The gradient descent update to W would be W := W - alpha * Wgrad.
[n, m] = size(data{1}); % n: feature dimension, m: number of samples
bias = 0.2;             % the margin alpha
% forward pass: accumulate the cost over all triplets
for i = 1:m
    cost = cost + ((W*data{1}(:,i) - W*data{2}(:,i))'*(W*data{1}(:,i) - W*data{2}(:,i)) ...
                 - (W*data{1}(:,i) - W*data{3}(:,i))'*(W*data{1}(:,i) - W*data{3}(:,i)) + bias)/m;
end
% compute the gradient of the cost with respect to W
Wgrad = (2*W*((data{1} - data{2})*(data{1} - data{2})' - (data{1} - data{3})*(data{1} - data{3})'))/m;
%-------------------------------------------------------------------
% After computing the cost and gradient, convert the gradient back to
% a vector format (suitable for minFunc) by unrolling the matrix.
grad = Wgrad(:);
end
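To confirm that the analytic gradient in tripletCost.m matches the cost, a quick finite-difference check can be run on a tiny random problem. This is a minimal sketch; checkTripletGradient is a hypothetical helper added here for illustration, not part of the original code:

function checkTripletGradient
% Hypothetical test helper: compare the analytic gradient from
% tripletCost against a central-difference numerical estimate.
inputSize = 8; hiddenSize = 5; m = 10;
data{1} = rand(inputSize, m);
data{2} = rand(inputSize, m);
data{3} = rand(inputSize, m);
theta = initializeParameters(hiddenSize, inputSize);
[~, grad] = tripletCost(theta, inputSize, hiddenSize, data);
numgrad = zeros(size(theta));
eps = 1e-4;
for i = 1:numel(theta)
    e = zeros(size(theta)); e(i) = eps;
    cp = tripletCost(theta + e, inputSize, hiddenSize, data);
    cm = tripletCost(theta - e, inputSize, hiddenSize, data);
    numgrad(i) = (cp - cm) / (2*eps); % central difference
end
% relative difference should be tiny (e.g. < 1e-8) if the gradient is correct
disp(norm(numgrad - grad) / norm(numgrad + grad));
end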
initializeParameters.m
function theta = initializeParameters(hiddenSize, inputSize)
% Initialize parameters randomly based on layer sizes.
r = sqrt(6) / sqrt(hiddenSize+inputSize+1); % we'll choose weights uniformly from the interval [-r, r]
W1 = rand(hiddenSize, inputSize) * 2 * r - r;
theta = W1(:);
end
In addition, an optimization package (minFunc) is required; see the resources.
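Once training finishes, the learned projection can be used to embed new features and compare distances. A minimal usage sketch, assuming opttheta, hiddenSize, inputSize, and data from the demo above are in scope:

% Reshape the optimized parameter vector back into the projection matrix.
W = reshape(opttheta, hiddenSize, inputSize);
% Embed one triplet and compare squared distances: after training, the
% positive should end up closer to the anchor than the negative.
fa = W * data{1}(:,1);  % embedded anchor
fp = W * data{2}(:,1);  % embedded positive
fn = W * data{3}(:,1);  % embedded negative
dp = sum((fa - fp).^2); % anchor-positive distance
dn = sum((fa - fn).^2); % anchor-negative distance
fprintf('d(a,p) = %.4f, d(a,n) = %.4f\n', dp, dn);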