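Over the past few days I have been reading the algorithm from Locality-constrained Linear Coding for Image Classification. It involves a coding step and a pooling step, which I walk through here:
1) Feature Extraction
In the code, the authors use Lazebnik's SIFT implementation to extract dense SIFT features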
CalculateSiftDescriptor(rt_img_dir, rt_data_dir, gridSpacing, patchSize, maxImSize, nrml_threshold)
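where patchSize is the size of the patch from which each SIFT descriptor is extracted, and gridSpacing is the stride by which the patch moves across the image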
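To make the roles of the two parameters concrete, here is a rough count of how many descriptors a dense grid yields. It is only a sketch: W and H (image width and height in pixels, after any resizing) are symbols introduced for illustration, and it assumes every patch must lie entirely inside the image.

```latex
n_x = \left\lfloor \frac{W - \mathtt{patchSize}}{\mathtt{gridSpacing}} \right\rfloor + 1,
\qquad
n_y = \left\lfloor \frac{H - \mathtt{patchSize}}{\mathtt{gridSpacing}} \right\rfloor + 1,
\qquad
n_{\text{descriptors}} = n_x \, n_y
```

For example, with hypothetical values W = 300, patchSize = 16 and gridSpacing = 6, this gives n_x = floor(284 / 6) + 1 = 48 patch positions along the width. A smaller gridSpacing gives a denser grid and more descriptors per image.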
2) Coding
The coding step is essentially the first two steps of the LLE (Locally Linear Embedding) algorithm
(1) Find the k nearest neighbors of each sample point
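The snippet for this step computes all pairwise squared Euclidean distances at once rather than looping over pairs. As a short derivation (the symbols x_i and b_j, denoting the i-th row of X and the j-th row of B, are introduced here for illustration), it relies on the expansion:

```latex
D_{ij} = \lVert x_i - b_j \rVert^2
       = \lVert x_i \rVert^2 - 2\, x_i^{\top} b_j + \lVert b_j \rVert^2
```

The three terms correspond to XX (replicated across columns), -2*X*B', and BB' (replicated across rows), so D is an N x M matrix whose i-th row holds the squared distances from data point i to every codebook entry. Because squaring preserves the ordering of non-negative distances, sorting each row of D yields the same nearest neighbors as sorting the true Euclidean distances.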
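% find k nearest neighbors
% B - M x d codebook, M entries in a d-dim space
% X - N x d matrix, N data points in a d-dim space
nframe = size(X, 1);   % N, number of data points
nbase = size(B, 1);    % M, number of codebook entries
XX = sum(X.*X, 2);     % N x 1, squared norm of each data point
BB = sum(B.*B, 2);     % M x 1, squared norm of each codebook entry
% D(i,j) = squared Euclidean distance between X(i,:) and B(j,:)
D = repmat(XX, 1, nbase)-2*X*B'+repmat(BB', nframe, 1);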
IDX = zero