The three R’s of computer vision: Recognition, reconstruction and reorganization (reading notes)
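The three R's of CV: recognition, reconstruction, reorganization
Recognition: finding objects in an image
Reconstruction: building 3D structure from images
Reorganization: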
Instead of the classical separation of vision into low level, mid level and high level vision, it is more fruitful to think of vision as resulting from the interaction of three processes: recognition, reconstruction and reorganization which operate in tandem, and where each provides input to the others and fruitfully exploits their output.
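In other words: rather than the traditional split of vision into low, mid and high levels, it is more productive to treat vision as the interaction of three processes: recognition, reconstruction and reorganization. The three run alongside one another rather than as a fixed pipeline, each supplying input to the others and using their output.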
Note that the emphasis of this paper is on the relationship between the 3R’s of vision, which is somewhat independent of the (very important) choice of features needed to implement particular algorithms.
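The paper's focus is the relationship among the 3R's. That relationship is, to some extent, independent of which features are chosen to implement any particular algorithm.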
Reorganization → recognition: reorganization helps recognition
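Object detection: the traditional approach is the sliding-window method, which requires all objects to share the same aspect ratio (this can be addressed with mixture models). An alternative, such as R-CNN, first produces a set of candidate image regions and then filters those regions to find the objects.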
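The contrast between the two detection styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `sliding_windows` and `filter_proposals`, the toy image size, the hand-written candidate boxes and the area-based `toy_score` are all made up for this note and are not the paper's or R-CNN's actual components.

```python
from typing import Callable, Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(img_w: int, img_h: int, win_w: int, win_h: int,
                    stride: int) -> Iterator[Box]:
    """Yield every window of one fixed size, so all boxes share one aspect ratio."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)


def filter_proposals(proposals: List[Box], score: Callable[[Box], float],
                     threshold: float) -> List[Box]:
    """Keep candidate regions (of any shape) whose score reaches the threshold."""
    return [box for box in proposals if score(box) >= threshold]


# Sliding window: every box is 4x2, so the aspect ratio is fixed.
windows = list(sliding_windows(img_w=8, img_h=4, win_w=4, win_h=2, stride=2))
aspect_ratios = {w / h for (_, _, w, h) in windows}


# Proposal-based: hypothetical candidate regions with differing aspect ratios.
proposals = [(0, 0, 4, 2), (1, 1, 2, 3), (3, 0, 5, 4)]


def toy_score(box: Box) -> float:
    """Stand-in score (assumption): box area as a fraction of an 8x4 image."""
    return (box[2] * box[3]) / (8 * 4)


kept = filter_proposals(proposals, toy_score, threshold=0.25)
```

In the sliding-window half, the window shape is chosen once and reused everywhere, which is where the shared-aspect-ratio requirement comes from. In the proposal half, each candidate carries its own shape, and the filtering step only decides which candidates to keep.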
R-CNNs scale very well with the number of