Understanding the RANSAC-FLOW code

朱小丰

Category: image matching. Tags: optical flow matching, RANSAC, RANSAC-FLOW, deep learning

Article link: https://blog.csdn.net/weixin_41866216/article/details/107973349

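This post walks through demo.py: given img_source and img_target and the trained model weights, it outputs the coarse alignment result and the flow alignment result predicted by the neural network.

img_source and img_target; the third image is the two blended with a weight of 0.5 each.

The network has four parts. netFeatCoarse is a fully convolutional feature extractor with 8x downsampling. netCorr computes the correlation, and netFlowCoarse predicts the flow.

netMatch is used to compute the cycle loss of the network. It matters for training and is not covered here.

I. Initializing the coarse alignment module

Parameters: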

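# 7 scales, setting ransac parameters
nbScale = 7 # extract features from img_source at 7 scales
coarseIter = 10000 # number of RANSAC iterations
coarsetolerance = 0.05 # RANSAC tolerance parameter
minSize = 400 # base image size: the longer side after resizing at scale 1
imageNet = True # we can also use MOCO feature here
scaleR = 1.2 # used to compute the scale ratios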

coarseModel = CoarseAlign(nbScale, coarseIter, coarsetolerance, 'Homography', minSize, 1, True, imageNet, scaleR)

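Initialization:

1. A pretrained ResNet-50 is used for feature extraction, with layers up to layer3 (the red highlighting in the code screenshot is only a display glitch).

2. Set up the feature preprocessing.

3. Build the list of multi-scale ratios.

A ratio of 1.2 means the image is resized to 1.2 times the base size (minSize).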
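
The seven ratios themselves are not listed in the post, but the sizes printed later (480 453 426 400 377 355 333) are consistent with two evenly spaced ramps, one from scaleR down to 1 and one from 1 down to 1/scaleR. The sketch below reconstructs the list under that assumption. `linspace` and `scale_list` are illustrative helpers, not names taken from the repository.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop, both endpoints included."""
    return [start * (1 - i / (num - 1)) + stop * (i / (num - 1)) for i in range(num)]


def scale_list(nb_scale, scale_r):
    """Assumed reconstruction: scale_r -> 1 -> 1/scale_r, with 1 in the middle."""
    if nb_scale == 1:
        return [1.0]
    half = nb_scale // 2 + 1
    upper = linspace(scale_r, 1.0, half)            # ratios >= 1
    lower = linspace(1.0, 1.0 / scale_r, half)[1:]  # ratios < 1, skipping the duplicate 1
    return upper + lower


scales = scale_list(7, 1.2)
sizes = [int(400 * s) for s in scales]  # minSize = 400, truncated as in the post's code
print(sizes)
```

With nbScale = 7 and scaleR = 1.2 this gives the same seven sizes as the debug comments in ResizeMaxSize, and the middle entry is the ratio 1.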

II. Multi-scale feature extraction on img_source

i1 refers to img_source.

1. Resize img_source to multiple scales

            IsList = []
            for i in range(len(self.scaleList)):
                # resize according to self.scaleList,
                # e.g. int(400 * self.scaleList[i])
                IsList.append(self.ResizeMaxSize(Is_org, int(self.minSize * self.scaleList[i])))
                # IsList holds the resized images

    # needs "from PIL import Image" at module level
    def ResizeMaxSize(self, I, minSize) :
        # -----------------------#
        # minSize = self.minSize * self.scaleList[i], one value per call:
        # 480 453 426 400 377 355 333
        # -----------------------#
        # worked example for the first call, minSize = 480:
        # input image w, h = 819, 614
        # ratio = max(819/480, 614/480) = 1.70625
        # new_w, new_h = round(819/ratio), round(614/ratio) = 480, 360
        # floor each side to a multiple of 16 (x // 16 * 16): 480, 352
        # ratioW, ratioH = 480/819, 352/614 = 0.5860805860805861, 0.5732899022801303
        # the image is resized to 480 x 352 with PIL.Image.LANCZOS (a high-quality downsampling filter)
        w, h = I.size # 819 614  # minSize 480
        ratio = max(w / float(minSize), h / float(minSize)) # 1.70625
        new_w, new_h = int(round(w/ ratio)), int(round(h / ratio))
        new_w, new_h = new_w // self.strideNet * self.strideNet , new_h // self.strideNet * self.strideNet # floor to a multiple of strideNet (16 = 2^4)
        ratioW, ratioH = new_w / float(w), new_h / float(h) # new / orig ratio
        Iresize = I.resize((new_w, new_h), resample=Image.LANCZOS)
        
        return Iresize
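
To check the arithmetic in the comments above without loading an image, the size computation can be pulled out into a small pure function. This is only a sketch: `resized_dims` is a hypothetical helper that mirrors the four arithmetic lines of ResizeMaxSize, with the stride fixed at 16 as in the comments.

```python
def resized_dims(w, h, max_side, stride=16):
    """Return (new_w, new_h): longer side close to max_side, both multiples of stride."""
    ratio = max(w / float(max_side), h / float(max_side))
    new_w, new_h = int(round(w / ratio)), int(round(h / ratio))
    # Floor each side to a multiple of the stride.
    return new_w // stride * stride, new_h // stride * stride


largest = resized_dims(819, 614, 480)  # the 819 x 614 example at the largest scale
base = resized_dims(819, 614, 400)     # the same image at scale 1
print(largest, base)
```

For the 819 x 614 example this gives 480 x 352 at the largest scale and 400 x 288 at scale 1, the sizes that appear in the printed outputs of this section.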

The list of resized images.

2. Take the image whose scale ratio is 1.

            self.Is = IsList[len(self.scaleList) // 2]
            # -----------------------#
            # print('self.Is')
            # print(self.Is)
            # the image for which self.scaleList = 1 (the middle entry)
            # <PIL.Image.Image image mode=RGB size=400x288 at 0x122023E1F08>
            # -----------------------#

            # convert to a tensor and add a batch dimension
            self.IsTensor = self.toTensor(self.Is).unsqueeze(0).cuda()
            # -----------------------#
            # print('self.IsTensor.shape')
            # print(self.IsTensor.shape)
            # torch.Size([1, 3, 288, 400])
            # -----------------------#

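3. Extract features from the image at every scale. The downsampling factor of this step is 16 (a 400x288 input gives a 25x18 feature map).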

            for i in range(len(self.scaleList)) :

                # IsList[i] -> ToTensor() and normalize -> unsqueeze(0), giving 1 x 3 x h x w
                # -> self.net -> F.normalize (L2 normalization)

                feat = F.normalize(self.net(self.preproc(IsList[i]).unsqueeze(0).cuda()))
                # -----------------------#
                #print(IsList[i])
                #print('feat')
                # print(feat.shape)
                # torch.Size([1, 1024, 22, 30])
                # torch.Size([1, 1024, 21, 28])
                # torch.Size([1, 1024, 19, 26])
                # torch.Size([1, 1024, 18, 25])
                # torch.Size([1, 1024, 17, 23])
                # torch.Size([1, 1024, 16, 22])
                # torch.Size([1, 1024, 15, 20])
                # -----------------------#

4. Get the indices along the X and Y directions

The indices along X and Y are shifted by +0.5 and then rescaled to the range [-1, 1].
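
A minimal sketch of that index-to-coordinate mapping, assuming the formula is ((i + 0.5) / n - 0.5) * 2, which places each index at the center of its cell. `cell_centers` and `flat_grid` are illustrative names, and the row-major flattening order is also an assumption.

```python
def cell_centers(n):
    """Centers of n equal cells on [-1, 1]: ((i + 0.5) / n - 0.5) * 2."""
    return [((i + 0.5) / n - 0.5) * 2 for i in range(n)]


def flat_grid(rows, cols):
    """Row-major list of (row_coord, col_coord) pairs, one per feature position."""
    row_coords, col_coords = cell_centers(rows), cell_centers(cols)
    return [(r, c) for r in row_coords for c in col_coords]


centers = cell_centers(4)
grid = flat_grid(18, 25)
print(centers, len(grid))
```

For an 18 x 25 feature map the flattened grid has 450 entries, the same length as the Wt and Ht tensors printed for the target image.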

5. Flatten each feature map spatially into w*h positions; 1024 is the number of channels and batch = 1.
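
As a quick arithmetic check, the seven feature-map sizes printed in the loop above can be flattened and added up. That the flattened maps of all scales are concatenated into one matrix is an inference from the total, which equals the 3235 columns of featA printed in mutualMatching.

```python
# Feature-map sizes (h, w) printed for the seven scales in the loop above.
shapes = [(22, 30), (21, 28), (19, 26), (18, 25), (17, 23), (16, 22), (15, 20)]

positions_per_scale = [h * w for h, w in shapes]  # columns contributed by each scale
total_positions = sum(positions_per_scale)

print(positions_per_scale)
print(total_positions)
```

The entry for scale 1 is 450, which is also the number of target positions (featB has 450 columns).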

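III. Feature extraction on img_target

This is essentially the same as above, except that only img_target is processed, at the single size minSize.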

    def setTarget(self, It_org) : 
        with torch.no_grad() : 
            self.It = self.ResizeMaxSize(It_org, self.minSize)
            self.ItTensor = self.toTensor(self.It).unsqueeze(0).cuda() # 1 3 288 400
            self.featt = F.normalize(self.net(self.preproc(self.It).unsqueeze(0).cuda()))
            self.Wt, self.Ht = outil.getWHTensor(self.featt)
            # -----------------------#
            #print('featt s', self.featt.shape)
            # torch.Size([1, 1024, 18, 25])
            # -----------------------#
            # print('self.Wt.shape and self.Ht.shape')
            # print(self.Wt.shape, self.Ht.shape)
            #torch.Size([450]) torch.Size([450])
            # -----------------------#

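IV. Features are extracted from ItTensor to obtain featt, which is used later when predicting the flow.

V. Initialize the flow grid, which is used later to compute the final flow result.

VI. Use getCoarse to obtain the transformation parameter matrix

1. Get the indices of the positions with the highest cosine similarity

(1) Compute the cosine similarity between the two feature sets to get the score matrix

(2) Take the top-1 along each of the two dimensions

(3) Combine the maxima along the two dimensions to get the matched points

(4) Take their indices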

import torch

def mutualMatching(featA, featB) :
    # -----------------------#
    # print('featA.shape', featA.shape) #torch.Size([1024, 3235])
    # print('featB.shape', featB.shape) #torch.Size([1024, 450])
    # -----------------------#
    score = torch.mm(featA.transpose(0, 1), featB) # nbA * nbB; the matrix product is the cosine similarity for L2-normalized features
    # -----------------------#
    # print('score.shape', score.shape) #torch.Size([3235, 450])
    # -----------------------#

    maxDim0, maxDim0Index = score.topk(k=1, dim = 0) # 1 * nbB
    maxDim1, maxDim1Index = score.topk(k=1, dim = 1) # nbA * 1
    # -----------------------#
    # maxDim0, maxDim0Index torch.Size([1, 450])
    # maxDim1, maxDim1Index torch.Size([3235, 1])
    # -----------------------#
    keepMaxDim0 = torch.zeros((featA.size(1), featB.size(1)), device=featA.device).scatter_(0, maxDim0Index, maxDim0)
    keepMaxDim1 = torch.zeros((featA.size(1), featB.size(1)), device=featA.device).scatter_(1, maxDim1Index, maxDim1)
    # print('keepMaxDim0', keepMaxDim0.shape) torch.Size([3235, 450])
    # print('keepMaxDim1', keepMaxDim1.shape) torch.Size([3235, 450])

    keepMax = keepMaxDim0 * keepMaxDim1
    keepMaxIndex = (keepMax > 0).nonzero()
    #print(keepMaxIndex)
    index1, index2 = keepMaxIndex[:, 0], keepMaxIndex[:, 1]
    return index1, index2 # indices into featA, indices into featB
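
The same mutual nearest-neighbour rule can be followed by hand on a tiny score matrix. The sketch below is a plain-Python restatement for illustration only; `mutual_matches` is a hypothetical name and the toy scores are made up.

```python
def mutual_matches(score):
    """Return (i, j) pairs where j is the best column for row i and i the best row for column j."""
    n_rows, n_cols = len(score), len(score[0])
    best_col_for_row = [max(range(n_cols), key=lambda j: score[i][j]) for i in range(n_rows)]
    best_row_for_col = [max(range(n_rows), key=lambda i: score[i][j]) for j in range(n_cols)]
    return [
        (i, j)
        for i, j in enumerate(best_col_for_row)
        # Keep only mutual best matches with a positive score, like keepMax > 0.
        if best_row_for_col[j] == i and score[i][j] > 0
    ]


# Toy cosine-similarity matrix: 3 source positions x 2 target positions.
score = [
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.7],
]
matches = mutual_matches(score)
print(matches)
```

Row 1 prefers column 0, but column 0 prefers row 0, so row 1 is dropped; only the pairs that choose each other survive.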

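2. match1 and match2 hold the indices of the matched points found in the previous step. This step runs RANSAC on those matches.

It returns the transformation parameter matrix bestParam and the indices of the matches that survive this refinement.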
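
To make the sample, score and keep-the-best loop concrete, here is a deliberately simplified RANSAC. The post fits a 'Homography'; this sketch swaps in a 2D translation so that one match is a minimal sample and the code stays short. The function name, the toy points and the fixed seed are all assumptions for illustration; only the tolerance 0.05 is taken from the parameters above.

```python
import random


def ransac_translation(src, dst, n_iter=200, tol=0.05, seed=0):
    """RANSAC for a 2D translation: sample one match, count matches within tol."""
    rng = random.Random(seed)
    best_t, best_inliers = None, []
    for _ in range(n_iter):
        k = rng.randrange(len(src))                         # minimal sample: one match
        t = (dst[k][0] - src[k][0], dst[k][1] - src[k][1])  # candidate model
        inliers = [
            i for i, (s, d) in enumerate(zip(src, dst))
            if ((s[0] + t[0] - d[0]) ** 2 + (s[1] + t[1] - d[1]) ** 2) ** 0.5 < tol
        ]
        if len(inliers) > len(best_inliers):                # keep the model with most inliers
            best_t, best_inliers = t, inliers
    return best_t, best_inliers


# Eight matches that follow the translation (0.1, -0.2), then three outliers.
src = [(-0.8 + 0.2 * i, -0.5 + 0.1 * i) for i in range(8)] + [(0.0, 0.0), (0.5, 0.5), (-0.5, 0.5)]
dst = [(x + 0.1, y - 0.2) for x, y in src[:8]] + [(0.9, 0.9), (-0.9, 0.1), (0.3, -0.9)]

best_t, inliers = ransac_translation(src, dst)
print(best_t, inliers)
```

The returned pair plays the role of bestParam and the refined match indices: the model with the most inliers, and the matches that agree with it.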

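VII. Use bestParam to obtain the coarse correspondence, then apply the coarse transformation to the original image to get the coarsely aligned result.

F.grid_sample warps its input by sampling it, with interpolation, at the locations given by a grid.

I1_coarse is the result of the coarse transformation; converting it to an image makes it easy to visualize.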
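
The grid handed to F.grid_sample stores, for every output pixel, the (x, y) location in [-1, 1] to sample from. A sketch of how such a grid could be built from a 3x3 matrix follows. The helper names and the example matrices are hypothetical, and whether bestParam maps target to source coordinates in exactly this way is an assumption.

```python
def cell_centers(n):
    """Centers of n equal cells on [-1, 1]."""
    return [((i + 0.5) / n - 0.5) * 2 for i in range(n)]


def apply_homography(H, x, y):
    """Map a point (x, y) through a 3x3 homography given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    wh = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / wh, yh / wh  # back from homogeneous coordinates


def warp_grid(H, rows, cols):
    """Sampling grid: for every output cell center, the mapped (x, y) location."""
    return [[apply_homography(H, x, y) for x in cell_centers(cols)] for y in cell_centers(rows)]


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift_x = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical bestParam

identity_grid = warp_grid(identity, 2, 2)
shifted_corner = warp_grid(shift_x, 2, 2)[0][0]
print(identity_grid, shifted_corner)
```

With the identity matrix every pixel samples its own location, so the warp leaves the image unchanged; the shifted matrix moves every sampling location by 0.5 along x.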

---

Fine Alignment

The coarsely aligned result is refined further.

---

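VIII

The coarsely aligned result is fed into the 8x-downsampling network for feature extraction.

IX

Input: featt, mentioned above, and featSample.

Output: the correlation between the two feature maps.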

---

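First, y is padded to enlarge it.

The sizes of y and x are related by y_w = x_w + 6 and y_h = x_h + 6.

Then the correlation coef is computed.

For i=0, j=0, x is multiplied with the w*h window at the top-left corner of y to get the correlation.

Iterating over all i and j gives 49 correlation maps.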
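
The loop described above can be sketched in plain Python on nested lists. The assumptions here are zero padding of 3 on every side (which gives the +6), a channel-wise dot product as the correlation, and the channel order index = i * 7 + j. `local_correlation` is an illustrative name, not the repository's.

```python
def local_correlation(x, y, pad=3):
    """x, y: feature maps as nested lists [C][H][W]. Returns coef as [(2*pad+1)**2][H][W]."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # Zero-pad y by `pad` on every side, so its size becomes (H + 2*pad, W + 2*pad).
    y_pad = [[[0.0] * (W + 2 * pad) for _ in range(H + 2 * pad)] for _ in range(C)]
    for c in range(C):
        for r in range(H):
            for k in range(W):
                y_pad[c][r + pad][k + pad] = y[c][r][k]
    coef = []
    for i in range(2 * pad + 1):      # vertical offset of the window in y_pad
        for j in range(2 * pad + 1):  # horizontal offset of the window in y_pad
            coef.append([
                [sum(x[c][r][k] * y_pad[c][r + i][k + j] for c in range(C)) for k in range(W)]
                for r in range(H)
            ])
    return coef


# Toy input: 2 channels, 2 x 3 positions, unit-norm features, and y identical to x.
x = [[[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
     [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]]
coef = local_correlation(x, x)
print(len(coef), coef[24])
```

Channel 24 (i = 3, j = 3) is the zero-displacement window, so with y equal to x it is 1 everywhere; channel 0 looks only at padding and is 0 everywhere.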

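X

coef is fed into the netFlowCoarse network for further refinement.

1. The network defines a flow grid for the x direction and one for the y direction, both of shape 1 49 1 1

2. The network extracts features, turning the correlation into a feature of shape 1 49 36 50

3. The flow grids of the two directions are multiplied with this feature (x in the code) to obtain the predicted flow.

The flow is then linearly upsampled back to size 1 2 288 400.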
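
One way to read step 3, stated as an assumption: each of the 49 channels corresponds to one cell of the 7 x 7 window, the flow grids hold that cell's displacement, and the product is summed over the 49 channels to leave one value per direction. The sketch below does this for a single position, with displacements in cells of the downsampled map and weights that sum to 1; the names, the units and the channel order index = i * 7 + j are illustrative guesses.

```python
def offset_grids(pad=3):
    """Two length-49 lists: x and y displacement for each cell of the 7 x 7 window."""
    size = 2 * pad + 1
    grid_x = [j - pad for i in range(size) for j in range(size)]
    grid_y = [i - pad for i in range(size) for j in range(size)]
    return grid_x, grid_y


def expected_flow(weights, pad=3):
    """Weighted sum of the offsets; weights are the 49 channel values at one position."""
    grid_x, grid_y = offset_grids(pad)
    flow_x = sum(w * g for w, g in zip(weights, grid_x))
    flow_y = sum(w * g for w, g in zip(weights, grid_y))
    return flow_x, flow_y


center = [0.0] * 49
center[24] = 1.0                  # all weight on the zero-displacement cell
split = [0.0] * 49
split[24], split[25] = 0.5, 0.5   # half on the center, half one cell to the right

flow_center = expected_flow(center)
flow_split = expected_flow(split)
print(flow_center, flow_split)
```

Splitting the weight between two neighbouring cells gives a displacement of half a cell, which is how a 49-channel map can express sub-cell flow.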

According to the paper, transposed convolution is not used because it hurts model performance and takes more memory.

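XI

The flow predicted by the network, flowdown8, is upsampled to flowup and added to the grid initialized above; the sum is the predicted flow.

F.grid_sample is then used to sample with interpolation at the coarse-alignment shape, which gives flow12 (is this step composing the two alignments? I am not sure).

Finally, IsTensor is warped with the predicted flow to get the flow alignment result.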
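
One possible reading of the flow12 step, offered as an assumption rather than a statement about the repository: sampling the coarse grid at the positions given by the fine flow composes the two mappings, so that a single lookup in the original source image applies both alignments. A 1-D sketch in pixel-index coordinates (not the normalized [-1, 1] coordinates that F.grid_sample uses) shows the idea; all names and values are made up.

```python
def sample_linear(values, pos):
    """Linearly interpolate a 1-D list at a fractional index, clamping at the borders."""
    pos = min(max(pos, 0.0), len(values) - 1.0)
    lo = int(pos)
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    return values[lo] * (1 - frac) + values[hi] * frac


def compose(coarse_map, fine_map):
    """For each output index, look up the coarse map at the position the fine map points to."""
    return [sample_linear(coarse_map, p) for p in fine_map]


n = 6
coarse_map = [2.0 * i + 1.0 for i in range(n)]  # coarse alignment: index i -> source position 2i + 1
fine_map = [i + 0.5 for i in range(n - 1)]      # fine flow: identity grid plus a 0.5 offset

flow12 = compose(coarse_map, fine_map)
print(flow12)
```

Each entry equals 2 * (i + 0.5) + 1, that is, the coarse mapping evaluated at the position chosen by the fine flow.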
