FCRN depth map prediction accuracy: depth estimation on iOS using an FCRN model

Computer Vision: iOS

Depth estimation is a major problem in computer vision, particularly for applications related to augmented reality, robotics, and even autonomous cars.

Traditional 3D sensors typically use stereoscopic vision, movement, or projection of structured light. However, these sensors depend on the environment (sunlight, texture) or require several peripherals (camera, projector), which leads to very bulky systems.

Many efforts have been made to build compact systems. Perhaps the most remarkable are light field cameras, which use a matrix of microlenses in front of the sensor.

Recently, several depth estimation approaches based on deep learning have been proposed. These methods use a single point of view (a single image) and generally optimize a regression on the reference depth map.

The first challenge concerns the network architecture, which usually follows the advances proposed each year in the field of deep learning: VGG16, residual networks (ResNet), and so on.

The second challenge is defining an appropriate loss function for deep regression. The relationship between networks and objective functions is complex, and their respective influences are difficult to distinguish.
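To make the idea of a loss function for depth regression concrete, here is a minimal sketch of the reverse Huber (berHu) loss. The article does not say which loss it has in mind; berHu is shown because it is the loss reported in the original FCRN paper (Laina et al., 2016), with the threshold set to one fifth of the largest residual. The function name and the flat-list representation of a depth map are assumptions made for illustration.

```python
def berhu_loss(predicted, target):
    """Mean reverse Huber (berHu) loss between two flat lists of depths.

    Assumption: depth maps are flattened to equal-length lists of floats.
    The threshold c is one fifth of the largest absolute residual.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [abs(p - t) for p, t in zip(predicted, target)]
    c = 0.2 * max(residuals)
    if c == 0.0:
        # Perfect prediction: every residual is zero.
        return 0.0

    total = 0.0
    for r in residuals:
        if r <= c:
            total += r                            # L1 behaviour for small errors
        else:
            total += (r * r + c * c) / (2.0 * c)  # quadratic for large errors
    return total / len(residuals)


# Example with made-up depths in meters.
print(berhu_loss([1.0, 2.0, 3.0], [1.0, 2.5, 5.0]))
```

Small residuals are penalized linearly and large ones quadratically, so the same network can behave quite differently depending on this single choice. That is one reason the effects of architecture and objective are hard to separate.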

Previous methods exploit the geometric aspects of the scene to deduce the depth. Another known cue for depth estimation is defocus blur.

Figure 1: Depth from Defocus method

However, depth estimation from defocus blur (Depth from Defocus, DFD) with a conventional camera and a single image suffers from two problems: an ambiguity about which side of the focal plane an object lies on, and a blind zone within the camera's depth of field, where no blur can be measured. Furthermore, to estimate the depth of an unknown blurred scene, DFD requires a scene model and a blur calibration to relate the measured blur to a depth value.
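The ambiguity and the blind zone can be illustrated with the standard thin-lens blur circle formula. The sketch below is not taken from the article: the lens parameters, the pixel pitch, and the function names are hypothetical values chosen only to show the effect.

```python
def blur_diameter(depth, focus_distance, focal_length, aperture_diameter):
    """Thin-lens blur circle diameter on the sensor (all lengths in meters)."""
    return (aperture_diameter * focal_length * abs(depth - focus_distance)
            / (depth * (focus_distance - focal_length)))


def mirror_depth(depth, focus_distance):
    """Depth behind the focal plane that yields the same blur as `depth`.

    Only defined for focus_distance / 2 < depth < focus_distance.
    """
    if not focus_distance / 2 < depth < focus_distance:
        raise ValueError("depth must lie between focus_distance / 2 and focus_distance")
    return focus_distance * depth / (2 * depth - focus_distance)


# Hypothetical setup: 50 mm lens at f/2 (25 mm aperture), focused at 2 m,
# with an assumed sensor pixel pitch of 5 micrometers.
FOCAL, APERTURE, FOCUS = 0.050, 0.025, 2.0
PIXEL_PITCH = 5e-6

near = 1.5
far = mirror_depth(near, FOCUS)

# Ambiguity: two different depths produce the same blur diameter.
print(near, blur_diameter(near, FOCUS, FOCAL, APERTURE))
print(far, blur_diameter(far, FOCUS, FOCAL, APERTURE))

# Blind zone: close to the focal plane the blur is smaller than one pixel.
print(blur_diameter(2.01, FOCUS, FOCAL, APERTURE) < PIXEL_PITCH)
```

With a single image, the measured blur alone cannot tell the near depth from the far one, and any object whose blur falls below the pixel pitch looks perfectly sharp, so its depth cannot be recovered from blur at all.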

Why mobile?

Figure 2: True depth rendering using the iPhone X TrueDepth camera

Augmented reality consists of inserting computer-generated images over real-world scenes using a mobile phone camera or special glasses (e.g., HoloLens).

Small cameras located in the middle and outside of each lens send continuous video images to two small screens on the inside of the glasses.

Once connected to a computer, the data is combined with live/filmed reality, creating a unique stereoscopic field of view on the LCD screen, where the computer-generated images are superimposed with those of the real world.

In 2017, Apple had the ingenious idea of putting a depth sensor in the front-facing iPhone camera, mainly to improve the security and accuracy of Face ID. Alongside this, they also released the first version of ARKit.

Unfortunately, the back cameras lacked that feature. Many developers were eager to have the same depth data on the back cameras so they could understand, and even reconstruct, the 3D structure of the world and insert digital objects in more immersive and realistic ways.

For now, the only way to get depth data from the back cameras is to predict the depth of a scene using a neural network, with a single image as the only input.

There’s a lot to consider when starting a mobile machine learning project.
