https://www.zybuluo.com/lutingting/note/555007
首先这篇文章很好的解释了SIFT的来龙去脉
首先尺度的问题,在不同尺度下观测到的东西是不同的,不同的尺度下对比相似事物也是不可能的。
SIFT特征提取,分为4个步骤:
- 尺度空间极值检验
- 关键点位置和尺度确定
- 关键点方向确定
- 特征向量生成
尺度空间极值检验
SIFT特征点就是尺度空间中稳定的点/极值点,为了得到这些点:
- 首先对输入图像建立尺度空间(通过图像金字塔)
- 从建立得到的尺度空间中检测极值点(转换为图像差分高斯金字塔中极值点的检验)
- 获取差分高斯金字塔
- DoG中极值点的检验
为什么要有不同的octave?
如果保持同样的尺度,那么当高斯滤波增大到很大的时候,滤波窗口就会变得很大,使计算量激增,如果先减小尺寸再滤波,就会好很多。
为什么不直接用DoG计算?
为了减少计算量。
对一副原始图像用方差为的高斯函数连续做两次滤波的结果,相当于对这幅图像直接用一个的高斯函数做一次滤波。
这句话有待研究
LoG和DoG相比,LoG计算量更大
得到DoG后,大值代表区别大,小值代表区别小
稳定特征点的判断,上下两个9加周围的8,一共27个点包括自己。
而且必须要上下都有才行
== 为什么S+3层呢?==
关键点位置及尺度确定
为了提高匹配稳定性,提高抗噪声能力,将两类点去除:
- 低对比度的点
- 边缘点
关键点方向确定
SiFT的特征匹配
- 取图像A中的某个关键点,并找出其与图像B中欧式距离最近的前两个关键点
- 在这两个关键点中,如果最近的距离除以次近的距离少于某个比例阈值,则接受这一对匹配点。降低这个比例阈值,SIFT匹配点数目会减少,但更加稳定。
- 利用2个近邻点比较的目的是为了排除因为图像遮挡和背景混乱而产生的无匹配关系的关键点,所以Lowe提出了比较最近邻距离与次近邻距离的方法,距离比率ratio小于某个阈值的认为是正确匹配。因为对于错误匹配,由于特征空间的高维性,相似的距离可能有大量其他的错误匹配,从而它的ratio值比较高。Lowe推荐ratio的阈值为0.8。但作者对大量任意存在尺度、旋转和亮度变化的两幅图片进行匹配,结果表明ratio取值在0. 4~0. 6之间最佳,小于0. 4的很少有匹配点,大于0. 6的则存在大量错误匹配点。(如果这个地方你要改进,最好给出一个匹配率和ration之间的关系图,这样才有说服力),作者建议ratio的取值原则如下:
ratio=0. 4 对于准确度要求高的匹配;
ratio=0. 6 对于匹配点数目要求比较多的匹配;
ratio=0. 5 一般情况下。
也可按如下原则: 当最近邻距离<200时,ratio=0. 6;反之,ratio=0. 4。ratio的取值策略能排分错误匹配点。
Illumination Invariance
red-blue difference is identical in both cases
Scale Space
pyramid representation
Exercise with SIFT
Four parts:
- detection of feature points
- description of feature points
- correspondence of feature points
- monocular visual odometry
下面是实验代码,他是对SIFT做了阉割
Detection Code(Matlab)
run_detection.m
i = 0;
filename = sprintf('images/%04d.png', i);
I = imread(filename);
[features, gradient_data] = detect_sift_light_features(I);
visualize_sift_light(I, features, 'SIFT(light)');
detect_sift_light_features.m
function [features, gradient_data] = detect_sift_light_features(I)
% DETECT_SIFT_LIGHT_FEATURES Detects SIFT (light) features in the image.
% Features returns the scale and image location of features as Nx4 matrix
% with row containing: [u v scale scale_id]. gradient_data is a structure (see
% below) that contains the gradients and orientations for all scales. Note,
% that scale_id in features must be used to access the gradients and
% orientations in gradient_data.
Idouble = double(I);
[height, width] = size(Idouble);
% discretization of scale space (sigma in lecture slides)
num_scales = 10;
scales = power(2, (0:num_scales) ./ 3);
% discretization of image and scale space L(sigma, v, u).
L = zeros(num_scales, height, width);%Log,octave
% a structure is a container to hold closely related data fields, here
% it holds the gradients and orientations for each scale,定义梯度信息
gradient_data = struct();
gradient_data.gradient_x = zeros(num_scales, height, width);
gradient_data.gradient_y = zeros(num_scales, height, width);
gradient_data.gradients = zeros(num_scales, height, width);
gradient_data.orientations = zeros(num_scales, height, width);
% sobel filter for gradients in y-direction (see assignment 01)
sobel_y = fspecial('sobel');
% sobel filter for gradients in x-direction (see assignment 01)
sobel_x = sobel_y';
for i = 1:num_scales
scale = scales(i);
% computation of gradients and orientations, the images are blurred by
% Gaussian with the scale as standard deviation, note: the gradients are
% used later for the descriptor computation, however they are computed here
% for reasons of simplicity and returned by this function for later use,求梯度信息,先高斯完,然后再索贝尔求梯度,然后求出梯度的方向和大小信息
gaussian = fspecial('gaussian', [5 5], scale);
Igaussian = imfilter(Idouble, gaussian);
Isobel_y = imfilter(Igaussian, sobel_y);
Isobel_x = imfilter(Igaussian, sobel_x);
gradient_data.gradients(i,:,:) = sqrt(power(Isobel_y,2) + power(Isobel_x,2));
gradient_data.orientations(i,:,:) = atan2(Isobel_y, Isobel_x);
% convolution with LoG using the scale
% store the result in the tensor
LoG = fspecial('log', 2*round(scale*[5 5])+1, scale);%round是四舍五入函数,LoG的卷积核是随sigma的增大而增大的。
L(i,:,:) = imfilter(Idouble, LoG);%这里不是用的那种减的方式,而是卷积
end
% compute the local maxima in a 15-by-15 environment
% it returns the indices of the maxima for each dimension,得到15*15方块里的极值的信息
[idx_scale, idx_v, idx_u] = localMaximum(L, 15, true);%
num_maxima = length(idx_scale);
% store the filter responses for each maximum together with the scale and
% image coordinates. each row of local_maximum_data contains:
% [u, v, scale, scale_id, LoG-response]
local_maximum_data = zeros(num_maxima, 5);
for i = 1:num_maxima
local_maximum_data(i, 1:4) = [idx_u(i), idx_v(i), scales(idx_scale(i)), idx_scale(i)];%这个地方是最大的两个值吧
local_maximum_data(i, 5) = L(idx_scale(i), idx_v(i), idx_u(i));%最大的数对应的L值
end
% compute the 0.5-quantil of the filter responses as threshold. Note: there
% is a Matlab function to compute the median, however the
% quantile()-function is more generic
threshold_rho = quantile(local_maximum_data(:,5), 0.5);%
% filter the feature points with a higher filter response than the
% threshold
filtered_indices = local_maximum_data(:,5) > threshold_rho;
filtered_features = local_maximum_data(filtered_indices, 1:4);
% add hessian matrix constraint
hessian_threshold_r = 10;
hessian_threshold = (hessian_threshold_r + 1)^2 / hessian_threshold_r;
features = zeros(0,4);
for i = 1:size(filtered_features,1)
x = filtered_features(i,1);
y = filtered_features(i,2);
scale_idx = filtered_features(i,4);
Ls = squeeze(L(scale_idx,:,:));
if (x > 2 && y > 2 && (x < width-1) && (y < height-1))
Dxx = (Ls(y,x+2) - 2*Ls(y,x) + Ls(y,x-2)) / 4;%这里是求Hessian,但是这个二次偏微分居然是这么表示,开了眼界
Dxy = ((Ls(y+1,x+1) - Ls(y+1,x-1)) - (Ls(y-1,x+1) - Ls(y-1,x-1))) / 4;
Dyy = (Ls(y+2,x) - 2*Ls(y,x) + Ls(y-2,x)) / 4;
Hessian = [Dxx, Dxy; Dxy Dyy];
trH = trace(Hessian);
detH = det(Hessian);
if (detH ~= 0)
if ((trH^2 / detH) < hessian_threshold)
features = [features; filtered_features(i,:)]; %#ok<*AGROW>
end
end
end
end
end
localMaximum.m
function varargout = localMaximum(x,minDist, exculdeEqualPoints)
% function varargout = localMaximum(x,minDist, exculdeEqualPoints)
%
% This function returns the indexes\subscripts of local maximum in the data x.
% x can be a vector or a matrix of any dimension
%
% minDist is the minimum distance between two peaks (local maxima)
% minDist should be a vector in which each argument corresponds to it's
% relevant dimension OR a number which is the minimum distance for all
% dimensions
%
% exculdeEqualPoints - is a boolean definning either to recognize points with the same value as peaks or not
% x = [1 2 3 4 4 4 4 4 4 3 3 3 2 1];
% will the program return all the '4' as peaks or not - defined by the 'exculdeEqualPoints'
% localMaximum(x,3)
% ans =
% 4 5 6 7 8 9 11 12
%
% localMaximum(x,3,true)
% ans =
% 4 7 12
%
%
% Example:
% a = randn(100,30,10);
% minDist = [10 3 5];
% peaks = localMaximum(a,minDist);
%
% To recieve the subscript instead of the index use:
% [xIn yIn zIn] = localMaximum(a,minDist);
%
% To find local minimum call the function with minus the variable:
% valleys = localMaximum(-a,minDist);
if nargin < 3
exculdeEqualPoints = false;
if nargin < 2
minDist = size(x)/10;
end
end
if isempty(minDist)
minDist = size(x)/10;
end
dimX = length ( size(x) );
if length(minDist) ~= dimX
% In case minimum distance isn't defined for all of x dimensions
% I use the first value as the default for all of the dimensions
minDist = minDist( ones(dimX,1) );
end
% validity checks
minDist = ceil(minDist);
minDist = max( [minDist(:)' ; ones(1,length(minDist))] );
minDist = min( [minDist ; size(x)] );
% ---------------------------------------------------------------------
if exculdeEqualPoints
% this section comes to solve the problem of a plato
% without this code, points with the same hight will be recognized as peaks
y = sort(x(:));
dY = diff(y);%计算差分当做step
% finding the minimum step in the data
minimumDiff = min( dY(dY ~= 0) );
%adding noise which won't affect the peaks
x = x + rand(size(x))*minimumDiff;
end
% ---------------------------------------------------------------------
se = ones(minDist);
X = imdilate(x,se);
f = find(x == X);
if nargout
[varargout{1:nargout}] = ind2sub( size(x), f );
else
varargout{1} = f;
end
Description Code (Matlab)
extrat_sift_light_descriptors.m
function [descriptors, valid_image_pixels] = extract_sift_light_descriptors(I, features, gradient_data)
% EXTRACT_SIFT_LIGHT_DESCRIPTORS Extracts SIFT (light) descriptors from the
% image given the features and the gradient_data as returned by
% detect_sift_light_features(..). valid_image_pixels contains the pixels as
% for which a descriptor has been computed. valid_image_pixels is a Nx2
% matrix with each row holding the pixel location [u, v]. Note, that the
% number of valid image pixels is <= the number of features.
[height, width] = size(I);
num_block = 4; % number of blocks for height and width
num_pixel = 4; % number of pixels per block for height and width
% this version uses a 4x4 blocks around the feature point with each block having
% 4x4 pixels. In total we get an area of 16x16 pixels around the feature point.
% The gradients are sorted per block into 8 orientation bins in the histogram so
% that the total descriptor size is 128
descriptors = zeros(0, num_block * num_pixel * 8);
valid_image_pixels = zeros(0, 2); % [u, v]
% as mentioned in the excercise text, the gradients are weighted by a
% Gaussian function with sigma being 8 (half the side of the 16x16 pixel area)
gaussian_weights = fspecial('gaussian', [num_block*num_pixel num_block*num_pixel], num_block*num_pixel/2);
for i = 1:size(features, 1)
p = features(i,1:2);
scale_index = features(i,4);
% extract the gradients and orientations for the scale of the feature
gradient_at_scale = squeeze(gradient_data.gradients(scale_index,:,:));%删除单一维度
orientation_at_scale = squeeze(gradient_data.orientations(scale_index,:,:));
% compute the upper-left and lower-right boundary for the local
% environment as [u, v]
upper_left = [p(1)-num_block*num_pixel/2+1 p(2)-num_block*num_pixel/2+1];%就是假设中心点是[8 8]左上是[1 1]
lower_right = [p(1)+num_block*num_pixel/2 p(2)-num_block*num_pixel/2];%右下是[16 16]
% check for validity - the feature is omitted if the 4x4 block lies
% outside of the image
if (upper_left(1) >= 1 && upper_left(2) >= 1 && ...
lower_right(1) <= width && lower_right(2) <= height)
% extract the gradients for the 16x16 local environment and weight the gradients
% by gaussians
gradient_block = gradient_at_scale(upper_left(2):lower_left(2),upper_left(1):lower_left(1)).*gaussian_weights;
% extract the orientations for the 16x16 local environment
orientation_block = orientation_at_scale(upper_left(2):lower_left(2),upper_left(1):lower_left(1).*gaussian_weights;
% compute the descriptor for each of the 4x4 block
descriptor = zeros(1, 128);
for row = 1:num_block
for col = 1:num_block
pos_in_descriptor = (row-1)*num_block*8 + (col-1)*8+1;
pos_in_block_u = (col-1)*num_pixel+1;
pos_in_block_v = (row-1)*num_pixel+1;
% 4x4 block of gradients
gradient_sub_block = gradient_block(pos_in_block_v:(pos_in_block_v+num_pixel-1), ...
pos_in_block_u:(pos_in_block_u+num_pixel-1));
orientation_sub_block = orientation_block(pos_in_block_v:(pos_in_block_v+num_pixel-1), ...
pos_in_block_u:(pos_in_block_u+num_pixel-1));
descriptor(1,pos_in_descriptor:(pos_in_descriptor+7)) = ...
compute_orientation_histogram(gradient_sub_block, orientation_sub_block);
end
end
% normalization of descriptor by L2-norm
descriptor = descriptor/norm(descriptor);
% add the descriptor and image pixel to the output data
descriptors = [descriptors; descriptor]; %#ok<AGROW>
valid_image_pixels = [valid_image_pixels; p]; %#ok<AGROW>
end
end
end
compute_orientation_histogram.m (参考了get_histogram_bin_assignment.m)
function histogram = compute_orientation_histogram(gradients, orientations)
% COMPUTE_ORIENTATION_HISTOGRAM Computes the 8-bin orientation histogram
histogram = zeros(1, 8);
[height, width] = size(gradients);
for row = 1:height
for col = 1:width
% extract orientation and gradient
o = orientations(row, col);
g = gradients(row, col);
% compute the assignment of gradients to bins (see comments in
% get_histogram_bin_assignment.m
[bin1, gradient1, bin2, gradient2] = get_histogram_bin_assignment(g, o);
% assign the gradient values to the histogram bins.
histogram(bin1) = histogram(bin1) + gradient1;
histogram(bin2) = histogram(bin2) + gradient2;
end
end
end
get_histogram_bin_assignment.m (就是分成8分,在那个角度里的就是了)
function [bin1, gradient1, bin2, gradient2] = get_histogram_bin_assignment(gradient, orientation)
% GET_HISTOGRAM_ASSIGNMENT Assigns a gradient two of 8 histogram bins given
% the orientation of the gradient.
% The eight histogram bins cover the orientation ranges (degrees):
% o [1] - center value: 0 - range: 337.5 - 22.5
% o [2] - center value: 45 - range: 22.5 - 67.5
% o ...
% o [8]: - center value: 315 - range: 292.5 - 337.5
%
% Bin1 is chosen by the orientation range which the orientation of the
% gradient lies in between, e.g. [2] is selected if the orientation is
% between 22.5 and 67.5 degrees. Bin2 however is selected by looking at the
% adjacents bins and selected the one whose center value is closer to the
% orientation.
%
% Gradient1 and gradient2 are assigned the input gradient proportionally to
% the distance of the orientation to the center values of bin1 and bin2.
% That is, the closer the orientation to the center value of a bin, the
% higher the proportion of gradient is assigned.
orientation = orientation*(orientation > 0) + (2*pi + orientation)*(orientation < 0);
if orientation >= deg2rad(337.5) || orientation <= deg2rad(22.5)
bin1 = 1;
if (orientation >= 0)
bin2 = 2;
gradient1 = gradient * (deg2rad(45) - orientation) / deg2rad(45);
gradient2 = gradient * orientation / deg2rad(45);
else
bin2 = 8;
gradient1 = gradient * (deg2rad(45) - (2*pi - orientation)) / deg2rad(45);
gradient2 = gradient * (2*pi - orientation) / deg2rad(45);
end
elseif orientation >= deg2rad(22.5) && orientation <= deg2rad(67.5)
bin1 = 2;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(45) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(45) - orientation) / deg2rad(45);
if (orientation >= deg2rad(45))
bin2 = 3;
else
bin2 = 1;
end
elseif orientation >= deg2rad(67.5) && orientation <= deg2rad(112.5)
bin1 = 3;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(90) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(90) - orientation) / deg2rad(45);
if (orientation >= deg2rad(90))
bin2 = 4;
else
bin2 = 2;
end
elseif orientation >= deg2rad(112.5) && orientation <= deg2rad(157.5)
bin1 = 4;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(135) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(135) - orientation) / deg2rad(45);
if (orientation >= deg2rad(135))
bin2 = 5;
else
bin2 = 3;
end
elseif orientation >= deg2rad(157.5) && orientation <= deg2rad(202.5)
bin1 = 5;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(180) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(180) - orientation) / deg2rad(45);
if (orientation >= deg2rad(180))
bin2 = 6;
else
bin2 = 4;
end
elseif orientation >= deg2rad(202.5) && orientation <= deg2rad(247.5)
bin1 = 6;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(225) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(225) - orientation) / deg2rad(45);
if (orientation >= deg2rad(225))
bin2 = 7;
else
bin2 = 5;
end
elseif orientation >= deg2rad(247.5) && orientation <= deg2rad(292.5)
bin1 = 7;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(270) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(270) - orientation) / deg2rad(45);
if (orientation >= deg2rad(270))
bin2 = 8;
else
bin2 = 6;
end
elseif orientation >= deg2rad(292.5) && orientation <= deg2rad(337.5)
bin1 = 8;
gradient1 = gradient * (deg2rad(45) - abs(deg2rad(315) - orientation)) / deg2rad(45);
gradient2 = gradient * abs(deg2rad(315) - orientation) / deg2rad(45);
if (orientation >= deg2rad(315))
bin2 = 1;
else
bin2 = 7;
end
end
end
run_matching.m
i = 0;
filename = sprintf('images/%04d.png', i);
I1 = imread(filename);
filename = sprintf('images/%04d.png', i+1);
I2 = imread(filename);
matches = compute_image_matches(I1, I2);
visualize_feature_matching(I1,I2, matches, 'test');
Correspondence of Feature Points
这里用的是KNN
function [idx,D]=knnsearch(varargin)
% KNNSEARCH Linear k-nearest neighbor (KNN) search
% IDX = knnsearch(Q,R,K) searches the reference data set R (n x d array
% representing n points in a d-dimensional space) to find the k-nearest
% neighbors of each query point represented by eahc row of Q (m x d array).
% The results are stored in the (m x K) index array, IDX.
%
% IDX = knnsearch(Q,R) takes the default value K=1.
%
% IDX = knnsearch(Q) or IDX = knnsearch(Q,[],K) does the search for R = Q.
%
% Rationality
% Linear KNN search is the simplest appraoch of KNN. The search is based on
% calculation of all distances. Therefore, it is normally believed only
% suitable for small data sets. However, other advanced approaches, such as
% kd-tree and delaunary become inefficient when d is large comparing to the
% number of data points. On the other hand, the linear search in MATLAB is
% relatively insensitive to d due to the vectorization. In this code, the
% efficiency of linear search is further improved by using the JIT
% aceeleration of MATLAB. Numerical example shows that its performance is
% comparable with kd-tree algorithm in mex.
%
% See also, kdtree, nnsearch, delaunary, dsearch
% By Yi Cao at Cranfield University on 25 March 2008
% Example 1: small data sets
%{
R=randn(100,2);
Q=randn(3,2);
idx=knnsearch(Q,R);
plot(R(:,1),R(:,2),'b.',Q(:,1),Q(:,2),'ro',R(idx,1),R(idx,2),'gx');
%}
% Example 2: ten nearest points to [0 0]
%{
R=rand(100,2);
Q=[0 0];
K=10;
idx=knnsearch(Q,R,10);
r=max(sqrt(sum(R(idx,:).^2,2)));
theta=0:0.01:pi/2;
x=r*cos(theta);
y=r*sin(theta);
plot(R(:,1),R(:,2),'b.',Q(:,1),Q(:,2),'co',R(idx,1),R(idx,2),'gx',x,y,'r-','linewidth',2);
%}
% Example 3: cputime comparion with delaunay+dsearch I, a few to look up
%{
R=randn(10000,4);
Q=randn(500,4);
t0=cputime;
idx=knnsearch(Q,R);
t1=cputime;
T=delaunayn(R);
idx1=dsearchn(R,T,Q);
t2=cputime;
fprintf('Are both indices the same? %d\n',isequal(idx,idx1));
fprintf('CPU time for knnsearch = %g\n',t1-t0);
fprintf('CPU time for delaunay = %g\n',t2-t1);
%}
% Example 4: cputime comparion with delaunay+dsearch II, lots to look up
%{
Q=randn(10000,4);
R=randn(500,4);
t0=cputime;
idx=knnsearch(Q,R);
t1=cputime;
T=delaunayn(R);
idx1=dsearchn(R,T,Q);
t2=cputime;
fprintf('Are both indices the same? %d\n',isequal(idx,idx1));
fprintf('CPU time for knnsearch = %g\n',t1-t0);
fprintf('CPU time for delaunay = %g\n',t2-t1);
%}
% Example 5: cputime comparion with kd-tree by Steven Michael (mex file)
% <a href="http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=7030&objectType=file">kd-tree by Steven Michael</a>
%{
Q=randn(10000,10);
R=randn(500,10);
t0=cputime;
idx=knnsearch(Q,R);
t1=cputime;
tree=kdtree(R);
idx1=kdtree_closestpoint(tree,Q);
t2=cputime;
fprintf('Are both indices the same? %d\n',isequal(idx,idx1));
fprintf('CPU time for knnsearch = %g\n',t1-t0);
fprintf('CPU time for delaunay = %g\n',t2-t1);
%}
% Check inputs
[Q,R,K,fident] = parseinputs(varargin{:});
% Check outputs
error(nargoutchk(0,2,nargout));
% C2 = sum(C.*C,2)';
[N,M] = size(Q);
L=size(R,1);
idx = zeros(N,K);
D = idx;
if K==1
% Loop for each query point
for k=1:N
d=zeros(L,1);
for t=1:M
d=d+(R(:,t)-Q(k,t)).^2;
end
if fident
d(k)=inf;
end
[D(k),idx(k)]=min(d);
end
else
for k=1:N
d=zeros(L,1);
for t=1:M
d=d+(R(:,t)-Q(k,t)).^2;
end
if fident
d(k)=inf;
end
[s,t]=sort(d);
idx(k,:)=t(1:K);
D(k,:)=s(1:K);
end
end
if nargout>1
D=sqrt(D);
end
function [Q,R,K,fident] = parseinputs(varargin)
% Check input and output
error(nargchk(1,3,nargin));
Q=varargin{1};
if nargin<2
R=Q;
fident = true;
else
fident = false;
R=varargin{2};
end
if isempty(R)
fident = true;
R=Q;
end
if ~fident
fident = isequal(Q,R);
end
if nargin<3
K=1;
else
K=varargin{3};
end
后面的那个用RANSAC的暂时先不看了。。。有机会再来补上