Monocular Depth Estimation Papers (6): Unsupervised learning of depth and ego-motion from video (CVPR 2017)

XiaoMin@

Last modified: 2022-05-10 16:38:32

Category: Depth Estimation. Tags: deep learning, computer vision, machine learning

First published: 2022-05-02 22:24:25

Article link: https://blog.csdn.net/smallEngineer/article/details/124548103

Unsupervised monocular depth estimation from video
[1] Zhou T, Brown M, Snavely N, et al. Unsupervised learning of depth and ego-motion from video[C]. Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1851-1858.

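Approach:
The figure below shows the whole pipeline. The architecture has two parts: a depth network (Depth CNN) and a pose network (Pose CNN).
1) Depth CNN (input: a single frame of the monocular video)
2) Pose CNN (input: consecutive video frames taken from different viewpoints)
3) Loss (the current frame is combined with the predicted depth map and the inter-frame transformation and projected onto the neighboring frames; the pixel error is used as the training loss, and the two networks are trained jointly)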
[Figure: overall pipeline with the Depth CNN and the Pose CNN]
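The middle frame is fed to the Depth CNN, which outputs the predicted depth map for that frame. The neighboring frames before and after it are fed to the Pose CNN, which outputs the predicted camera motion (pose). The predicted depth and pose are then used to warp the neighboring frames into a prediction of the original middle frame, and the difference between the predicted frame and the original frame serves as the loss function.
The depth network is trained without supervision and uses an architecture similar to DispNet. Its input is a single frame I_t and its output is the corresponding depth map. The pose network takes a target image together with its temporally nearby images I_s (s = t-1, t+1, t-2, t+2, …) as input, and outputs the pose from the target image to each of these nearby images, expressed as Euler angles and a translation (6 degrees of freedom in total).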
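To make the 6-DoF pose output more concrete, here is a minimal sketch of how three Euler angles and a translation could be turned into a 4x4 rigid transform. The function names, the ordering of the pose vector (tx, ty, tz, rx, ry, rz), and the rotation order R = Rz·Ry·Rx are assumptions chosen for illustration; the paper's own implementation may use a different convention.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def euler_to_rotation(rx, ry, rz):
    """Build R = Rz @ Ry @ Rx (assumed convention); angles are in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    rot_x = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    rot_y = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rot_z = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rot_z, matmul(rot_y, rot_x))

def pose_vec_to_matrix(pose):
    """Turn a hypothetical pose vector (tx, ty, tz, rx, ry, rz) into a 4x4 transform."""
    tx, ty, tz, rx, ry, rz = pose
    rot = euler_to_rotation(rx, ry, rz)
    # Append the translation as the fourth column, then the homogeneous row.
    return [rot[0] + [tx], rot[1] + [ty], rot[2] + [tz], [0.0, 0.0, 0.0, 1.0]]
```

For example, `pose_vec_to_matrix([0, 0, 0, 0, 0, 0])` gives the identity transform, which corresponds to a camera that did not move between the two frames.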
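Using the depth and pose obtained above, each pixel of I_t is projected into I_s and I_s is sampled at the projected location, which warps I_s into a reconstruction of I_t. The convolutional networks are trained to minimize the total photometric error between I_t and these reconstructions, which yields the final pose.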
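The warping and photometric error described above can be sketched in plain Python as follows. This is a simplified illustration, not the authors' implementation: it assumes grayscale images stored as nested lists, a pinhole camera given as a hypothetical tuple K = (fx, fy, cx, cy), a 4x4 target-to-source transform T, and it covers only the photometric term. All function names are made up for this example.

```python
def project_pixel(u, v, depth, K, T):
    """Project target pixel (u, v) with the given depth into the source view.

    Follows p_s ~ K * T * D(p_t) * K^-1 * p_t. Returns None when the point
    ends up behind the source camera.
    """
    fx, fy, cx, cy = K
    # Back-project the pixel to a 3D point in the target camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    z = depth
    # Move the point into the source camera frame.
    xs = T[0][0] * x + T[0][1] * y + T[0][2] * z + T[0][3]
    ys = T[1][0] * x + T[1][1] * y + T[1][2] * z + T[1][3]
    zs = T[2][0] * x + T[2][1] * y + T[2][2] * z + T[2][3]
    if zs <= 1e-6:
        return None
    # Perspective projection back to pixel coordinates.
    return fx * xs / zs + cx, fy * ys / zs + cy

def bilinear_sample(img, u, v):
    """Sample img at a sub-pixel location; None if the location is outside."""
    h, w = len(img), len(img[0])
    if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
        return None
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    top = img[v0][u0] * (1 - du) + img[v0][u1] * du
    bottom = img[v1][u0] * (1 - du) + img[v1][u1] * du
    return top * (1 - dv) + bottom * dv

def photometric_loss(target, source, depth_map, K, T):
    """Mean absolute error between I_t and I_s warped into the target view."""
    total, count = 0.0, 0
    for v, row in enumerate(target):
        for u, intensity in enumerate(row):
            projected = project_pixel(u, v, depth_map[v][u], K, T)
            if projected is None:
                continue
            sampled = bilinear_sample(source, projected[0], projected[1])
            if sampled is None:
                continue  # pixel falls outside the source image
            total += abs(intensity - sampled)
            count += 1
    return total / count if count else 0.0
```

Because bilinear sampling is a weighted average of neighboring pixels, the loss varies smoothly with the projected coordinates, which is what allows the error to be propagated back to both the depth and the pose predictions in a real training setup.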
