SIFTFLOW代码解析

最新推荐文章于 2024-08-05 17:39:34 发布

dadaHaHa1234

最新推荐文章于 2024-08-05 17:39:34 发布

阅读量967

点赞数 1

本文链接：https://blog.csdn.net/qq_32425195/article/details/105610378

版权

SIFTFLOW代码解析）

- - - 3.warp:warpI2=warpImage(Im2,vx,vy)
  - 用python_pytorch完成的sift_flow

下面展示一些 内联代码片。

几个关键步骤：
1.Compute the dense SIFT image：Sift1=dense_sift(im1,patchsize,gridspacing);
2.SIFT flow matching：[vx,vy,energylist]=SIFTflowc2f(Sift1,Sift2,SIFTflowpara)
3.warp:warpI2=warpImage(Im2,vx,vy)

dense_sift：
功能：提取一张图片的各个点的sift特征，输入数据为（输入图像，再周围patchsize*patchsize范围提取特征，）
假设输入图像大小：450*416*3，patchsize=8,gridspacing=1
返回结果：sift_arr:444*410*128.因为pathsize边角的不处理，所以sift_arr的H,W减少，128=num_angles*num_bins*num_bins=8*4*4，其中num_bins表示我们再patchSzie中选择几个，如果num_bins=4,patchSize=8，表示采样间隔为2；num_angles表示我们把(-pi，pi)分为几个范围（anglesbin），当num_angles=8时，angle_step = 2 * pi / num_angles=pi/4。sift_arr中每个元素的值是梯度方向与大小的编码（梯度方向为梯度的方向角与anglebin差值的cos，梯度大小作为权重）。 grid_x, grid_y表示我们计算sift_arr在图中对应的位置的坐标。

function [sift_arr, grid_x, grid_y] = dense_sift(I, patch_size, grid_spacing)

I = double(I);
I = mean(I,3);
I = I /max(I(:));

% parameters
num_angles = 8; %把（-pi,pi）的方向角分为8个anglebins
num_bins = 4; %在patchSize中采样4个值
num_samples = num_bins * num_bins;
alpha = 9; %% parameter for attenuation of angles (must be odd)

angle_step = 2 * pi / num_angles; %num_angles = 8 所以angle_step=0.7854
angles = 0:angle_step:2*pi; %[0,0.785398163397448,1.570796326794897,2.356194490192345,3.141592653589793,3.926990816987241,4.712388980384690,5.497787143782138,6.283185307179586]
angles(num_angles+1) = []; % bin centers

[hgt wid] = size(I);

%计算梯度方向和梯度大小，用来编码特征的值
[G_X,G_Y]=gen_dgauss(sigma_edge); %求x方向，y方向边缘的核函数
I_X = filter2(G_X, I, 'same'); % vertical edges 和I大小相同
I_Y = filter2(G_Y, I, 'same'); % horizontal edges
I_mag = sqrt(I_X.^2 + I_Y.^2); % gradient magnitude
I_theta = atan2(I_Y,I_X);%每个点的边缘的方向角，和I尺寸相同
I_theta(find(isnan(I_theta))) = 0; % necessary????

% grid 
grid_x = patch_size/2:grid_spacing:wid-patch_size/2+1;%I尺寸为450*416时，grid_x为4,5,6，....413
grid_y = patch_size/2:grid_spacing:hgt-patch_size/2+1;

% make orientation images
I_orientation = zeros([hgt, wid, num_angles], 'single');%hgt=450, wid=416, num_angles=8

% for each histogram angle
cosI = cos(I_theta);
sinI = sin(I_theta);
%这里计算得到的 I_orientation是该点的方向角与各个bins中心的方向角的差值的cos,然后以该点的梯度的大小作为权重
for a=1:num_angles
    % compute each orientation channel    % cos(I-angle(a))=coscos+sinsin,I是计算角度，abgle(a)是中心角度
    tmp = (cosI*cos(angles(a))+sinI*sin(angles(a))).^alpha;
    tmp = tmp .* (tmp > 0);

    % weight by magnitude
    I_orientation(:,:,a) = tmp .* I_mag;
end

% Convolution formulation:
weight_kernel = zeros(patch_size,patch_size);
r = patch_size/2; %4
cx = r - 0.5; %3.5
sample_res = patch_size/num_bins;%8/4=2
weight_x = abs((1:patch_size) - cx)/sample_res;
weight_x = (1 - weight_x) .* (weight_x <= 1); %[0,0.250000000000000,0.750000000000000,0.750000000000000,0.250000000000000,0,0,0]
%weight_kernel = weight_x' * weight_x;
%这里：因为我们统计像素中心pathcsize*patchsize范围的特征值，所以离中心越近，权重就越大，越远权重越小
for a = 1:num_angles
    %I_orientation(:,:,a) = conv2(I_orientation(:,:,a), weight_kernel, 'same');
    I_orientation(:,:,a) = conv2(weight_x, weight_x', I_orientation(:,:,a), 'same'); %conv2二维卷积,weightx'变为列矩阵,conv2(u,v,A) 首先求 A 的各列与向量 u 的卷积，然后求每行结果与向量 v 的卷积。
end

% Sample SIFT bins at valid locations (without boundary artifacts)
% find coordinates of sample points (bin centers)
[sample_x, sample_y] = meshgrid(linspace(1,patch_size+1,num_bins+1));
sample_x = sample_x(1:num_bins,1:num_bins); sample_x = sample_x(:)-patch_size/2;
sample_y = sample_y(1:num_bins,1:num_bins); sample_y = sample_y(:)-patch_size/2;

sift_arr = zeros([length(grid_y) length(grid_x) num_angles*num_bins*num_bins], 'single');
b = 0;
for n = 1:num_bins*num_bins
    sift_arr(:,:,b+1:b+num_angles) = I_orientation(grid_y+sample_y(n), grid_x+sample_x(n), :);
    b = b+num_angles;
end
clear I_orientation


% Outputs:
[grid_x,grid_y] = meshgrid(grid_x, grid_y);
[nrows, ncols, cols] = size(sift_arr);

% normalize SIFT descriptors
sift_arr = reshape(sift_arr, [nrows*ncols num_angles*num_bins*num_bins]);
sift_arr = normalize_sift(sift_arr);
sift_arr = reshape(sift_arr, [nrows ncols num_angles*num_bins*num_bins]);

2.SIFT flow matching：[vx,vy,energylist]=SIFTflowc2f(Sift1,Sift2,SIFTflowpara)
计算两个位置的关联度，包括3个部分：第一项让对应位置的光流描述符相似第二项限制光流向量的大小第三项让周围的光流向量尽量一样，而不会差距过大


影响速度和精度的参数：
%           nlevels: (4) the number of levels of the Gaussian pyramid
%           topwsize: (10) the size of the matching window at the top level
%           nTopIterations: (100) the number of BP iterations at the top level
%           wsize:   (3) the size of the matching window at lower levels
%           nIterations: (40) the number of BP iterations at lower levels


% prepare the parameters
SIFTflowpara.alpha=2;
SIFTflowpara.d=40;
SIFTflowpara.gamma=0.005;
SIFTflowpara.nlevels=4;
SIFTflowpara.wsize=5;
SIFTflowpara.topwsize=20;
SIFTflowpara.nIterations=60;

[vx,vy,energylist]=SIFTflowc2f(Sift1,Sift2,SIFTflowpara)



function [vx,vy,energylist]=SIFTflowc2f(im1,im2,SIFTflowpara,isdisplay)



% build the Gaussian pyramid for the SIFT images
pyrd(1).im1=im1;
pyrd(1).im2=im2;

for i=2:nlevels %使用coarse-to-fine的策略匹配时，nlevels的大小设置为4.每一层是下一层的2倍下采样
    pyrd(i).im1=imresize(imfilter(pyrd(i-1).im1,fspecial('gaussian',5,0.67),'same','replicate'),0.5,'bicubic');
    pyrd(i).im2=imresize(imfilter(pyrd(i-1).im2,fspecial('gaussian',5,0.67),'same','replicate'),0.5,'bicubic');
end

for i=1:nlevels
    [height,width,nchannels]=size(pyrd(i).im1); %获得该octave的特征图的size
    [height2,width2,nchannels]=size(pyrd(i).im2);
    [xx,yy]=meshgrid(1:width,1:height);    
    pyrd(i).xx=round((xx-1)*(width2-1)/(width-1)+1-xx);
    pyrd(i).yy=round((yy-1)*(height2-1)/(height-1)+1-yy);
end

nIterationArray=round(linspace(nIterations,nIterations*0.6,nlevels)); %60,52,44,36

for i=nlevels:-1:1
    if isdisplay
        fprintf('Level: %d...',i);
    end
    [height,width,nchannels]=size(pyrd(i).im1);
    [height2,width2,nchannels]=size(pyrd(i).im2);
    [xx,yy]=meshgrid(1:width,1:height);

    if i==nlevels % top level
        vx=pyrd(i).xx;
        vy=pyrd(i).yy;
        
        winSizeX=ones(height,width)*topwsize;
        winSizeY=ones(height,width)*topwsize;
    else % lower levels 底层匹配时在自己的对应窗口中匹配
        vx=round(pyrd(i).xx+imresize(vx-pyrd(i+1).xx,[height,width],'bicubic')*2);
        vy=round(pyrd(i).yy+imresize(vy-pyrd(i+1).yy,[height,width],'bicubic')*2);
        
        winSizeX=ones(height,width)*wsize;
        winSizeY=ones(height,width)*wsize;
    end
    if nchannels<=3
        Im1=im2feature(pyrd(i).im1);
        Im2=im2feature(pyrd(i).im2);
    else
        Im1=pyrd(i).im1;
        Im2=pyrd(i).im2;
    end
    %从高向低，计算对应点的距离，flow是偏移向量，foo是对应位置的energy
    if i==nlevels
        [flow,foo]=mexDiscreteFlow(Im1,Im2,[alpha,d,gamma*2^(i-1),nTopIterations,2,topwsize],vx,vy,winSizeX,winSizeY);
    else
        [flow,foo]=mexDiscreteFlow(Im1,Im2,[alpha,d,gamma*2^(i-1),nIterationArray(i),nlevels-i,wsize],vx,vy,winSizeX,winSizeY);
    end
    energylist(i).data=foo;
    vx=flow(:,:,1);
    vy=flow(:,:,2);
    if isdisplay
        fprintf('done!\n');
    end
end

3.warp:warpI2=warpImage(Im2,vx,vy)

根据光流vx,vy把im2做warp

% function to warp images with different dimensions
function [warpI2,mask]=**warpImage**(im,vx,vy)

[height2,width2,nchannels]=size(im); %444*410*3
[height1,width1]=size(vx); %444*410

[xx,yy]=meshgrid(1:width2,1:height2); %生成1-444的行格点和1-410列格点，xx,yy是相同的格点矩阵
[XX,YY]=meshgrid(1:width1,1:height1);
XX=XX+vx; %生成经过光流后对应的坐标
YY=YY+vy;
mask=XX<1 | XX>width2 | YY<1 | YY>height2; %此处的mask标出坐标超限的地方
XX=min(max(XX,1),width2); %对XX中的每个数，如果小于1，或者大于width2，则做一个截断
YY=min(max(YY,1),height2);

for i=1:nchannels
    foo=interp2(xx,yy,im(:,:,i),XX,YY,'bicubic');%原来为im(:,:,i)=f(xx,yy),其中xx,yy是坐标构成的meshgrid,im是图像像素值；XX,YY是插值坐标，返回的foo就是插值坐标对应的图像像素值
    foo(mask)=0.6;%对于没办法插值的坐标，设为0.6
    warpI2(:,:,i)=foo;
end

mask=1-mask;

光流的表示：flowToColor

在这里插入图片描述

vx1=meshgrid(1:100,1:100);
flow(:,:,1)=vx1;
vy1=meshgrid(1:100,1:100);
flow(:,:,2)=vy1;
figure;imshow(flowToColor(flow));title(‘SIFT flow field’);
结果：

设置>> flow(:,:,2)=-vy1;
figure;imshow(flowToColor(flow));title(‘SIFT flow field’);
结果：