Notes on Understanding the PSMNet Code

learning_always

Category: Paper Reproduction. Tags: autonomous driving, deep learning, pytorch

Article link: https://blog.csdn.net/learninging_csdn/article/details/88997983

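This article walks through the PSMNet code: data preprocessing, the network model, training, and prediction. Data preprocessing covers image cropping and normalization. The network model uses basic blocks and a stacked hourglass structure to build and refine the cost volume. During training, pay attention to the batch size; prediction produces a grayscale disparity map. The code is clear and well suited for beginners to try.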
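The normalization step mentioned above can be sketched without any dependencies. This is a minimal illustration, not the repository's code: the helper names `normalize_pixel` and `normalize_image` are hypothetical, and the ImageNet channel statistics are an assumption, since the article does not state which mean and standard deviation are used.

```python
# Assumed ImageNet channel statistics; the article does not state these values.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, mean, std))


def normalize_image(image):
    """Normalize an image given as rows of (r, g, b) tuples."""
    return [[normalize_pixel(px) for px in row] for row in image]
```

In a real pipeline the same arithmetic would normally be applied to whole tensors at once; plain tuples are used here only to keep the sketch self-contained.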

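As a newcomer to deep learning, the first paper I reproduced was "Pyramid Stereo Matching Network". The authors have open-sourced the code: https://github.com/JiaRenChang/PSMNet
I have understood most of the code, so I am posting parts of it here with brief comments. Read the code, the comments, and the notes below them together. Everything here targets KITTI2015 only; the other training sets are not used.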

Data preprocessing

First, the preprocessing for KITTI2015.
dataloader/KITTIloader2015.py

class myImageFloder(data.Dataset):
    def __init__(self, left, right, left_disparity, training, loader=default_loader, dploader=disparity_loader):
        # Parallel lists: index i refers to the same scene in all three.
        self.left = left
        self.right = right
        self.disp_L = left_disparity
        # Loader callables for the RGB images and for the disparity map.
        self.loader = loader
        self.dploader = dploader
        # True for the training split.
        self.training = training

    def __getitem__(self, index):
        left = self.left[index]
        right = self.right[index]
        disp_L = self.disp_L[index]

        left_img = self.loader(left)
        right_img = self.loader(right)
        dataL = self.dploader(disp_L)


        if self.training:
            # Pick one random 256x512 window and apply it to both views
            # and to the disparity map so the three stay aligned.
            w, h = left_img.size
            th, tw = 256, 512

            x1 = random.randint(0, w - tw)
            y1 = random.randint(0, h - th)

            left_img = left_img.crop((x1, y1, x1 + tw, y1 + th))
            right_img = right_img.crop((x1, y1, x1 + tw, y1 + th))
            left_img = np.array(left_img, dtype=np.uint8)
            right_img = np.array(right_img, dtype=np.uint8)

            # KITTI disparity PNGs store disparity * 256, so divide to get pixels.
            dataL = np.ascontiguousarray(dataL, dtype=np.float32) / 256
            dataL = dataL[y1:y1 + th, x1:x1 + tw]

            # Standard preprocessing, with augmentation switched off.
            processed = preprocess.get_transform(augment=False)
            left_img = processed(left_img)
            right_img = processed(right_img)
            return left_img, right_img, dataL
        else:
            # The fixed bottom-right 1232x368 crop is disabled by keeping it
            # inside a string literal; a random 256x512 crop is used instead.
            """
            w, h = left_img.size

            left_img = left_img.crop((w - 1232, h - 368, w, h))
            right_img = right_img.crop((w - 1232, h - 368, w, h))
            #w1, h1 = left_img.size

            dataL = dataL.crop((w - 1232, h - 368, w, h))
            dataL = np.ascontiguousarray(dataL, dtype=np.float32) / 256

            processed = preprocess.get_transform(augment=False)
            left_img = processed(left_img)
            right_img = processed(right_img)
            """
            w, h = left_img.size
            th, tw = 256, 512

            x1 = random.randint(0, w - tw)
            y1 = random.randint(0, h - th)
