Based on YOLO[1]
Potential applications: instance segmentation and occlusion handling
Motivation:
Current object detection approaches predict bounding boxes that provide little instance-specific information beyond location, scale and aspect ratio.
A bounding box captures no more than the location, scale and aspect ratio of an object, a fairly coarse representation that provides no information about the object’s boundary. On the other hand, bottom-up pixel-labelling approaches have no explicit notion of local and global object shape.
Goal: regress directly to objects’ shapes in addition to their bounding boxes and categories.
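First, a quick review of YOLO[1]:
Each grid cell outputs B bounding boxes (5 values each: x, y, w, h and confidence), plus C class probabilities, one per object category.
Here x, y are the normalised offsets of the bounding-box centre relative to the current grid cell. confidence reflects whether the bounding box contains an object and how accurately the box is localised. It is computed as confidence = P(object) * IOU, where P(object) = 1 or 0.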
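The confidence definition above can be sketched in a few lines of Python. This is only an illustration: the corner-format boxes (x_min, y_min, x_max, y_max) and the function names `iou` and `confidence_target` are assumptions made for this example and do not come from the notes or from any YOLO implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlap rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confidence_target(has_object, predicted_box, truth_box):
    """confidence = P(object) * IOU, with P(object) equal to 1 or 0."""
    p_object = 1.0 if has_object else 0.0
    if p_object == 0.0:
        return 0.0
    return p_object * iou(predicted_box, truth_box)
```

For instance, two 2 × 2 boxes offset by one unit in each direction overlap in a 1 × 1 square, so their IOU is 1 / (4 + 4 - 1) = 1/7.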
The output dimension of the final fully connected layer of the YOLO network is S*S*(B*5 + C).
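As a quick arithmetic check of the output dimension, here is a minimal sketch. The values S = 7, B = 2, C = 20 are not stated in these notes; they are the settings commonly quoted for the original YOLO on PASCAL VOC and are used here only as an example. The function name `yolo_output_size` is hypothetical.

```python
def yolo_output_size(S, B, C):
    """Number of values predicted by the final layer: S*S*(B*5 + C)."""
    # Each of the S*S grid cells predicts B boxes of 5 values plus C class probabilities.
    return S * S * (B * 5 + C)


# Assumed example settings: 7 * 7 * (2 * 5 + 20) = 7 * 7 * 30
print(yolo_output_size(7, 2, 20))  # 1470
```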
loss function:
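The formula itself is missing from these notes. For reference, the sum-squared-error loss as it is usually written for the original YOLO is sketched below. The notation is an assumption taken from that common formulation rather than from these notes: hats denote predictions, the indicator 1_ij^obj is 1 when box j of cell i is responsible for an object, C_i here denotes the box confidence (not the number of classes), and the weights are usually quoted as λ_coord = 5 and λ_noobj = 0.5.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
    \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
    \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=1}^{S^2} \sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=1}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The square roots on w and h reduce the penalty gap between small and large boxes, and λ_noobj down-weights the many cells that contain no object.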
Method:
Key point: learning of a compact and decodable embedding space in which shapes can be described and compared.
Deep regression network
Extend YOLO to regress to object locations, confidence scores, conditional probabilities for each category, and detailed shape encodings.
Each grid cell also predicts a conditional probability mass function p(c|o) over the categories. The class-specific confidence is then p(c|o) ∗ p(o) ∗ IoU = p(c) ∗ IoU.
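A worked instance of the product above, as a minimal sketch. The function name `class_scores` and the example numbers are made up for illustration.

```python
def class_scores(p_class_given_object, p_object, iou):
    """Class-specific confidence p(c|o) * p(o) * IoU for every category c."""
    return [p_c * p_object * iou for p_c in p_class_given_object]


# Hypothetical cell: three categories, an object is present, box IoU of 0.5.
scores = class_scores([0.5, 0.25, 0.25], p_object=1.0, iou=0.5)
```

With these inputs the first category scores 0.5 ∗ 1.0 ∗ 0.5 = 0.25, and each of the other two scores 0.125.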
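The output is S×S×(N×B+|C|), where N = 1+4+256 when the shape is encoded by a 16 × 16 binary mask (16 × 16 = 256 shape values, alongside the 4 box coordinates and 1 confidence score).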
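The sketch below checks the extended output size and shows one way a 256-value shape encoding could be turned back into a 16 × 16 binary mask. Several things here are assumptions for illustration only: the example settings S = 7, B = 2, |C| = 20, the row-major layout of the 256 values, the 0.5 threshold, and the names `shape_output_size` and `decode_mask`.

```python
def shape_output_size(S, B, num_classes, shape_dim=256):
    """Output size S*S*(N*B + |C|) with N = 1 + 4 + shape_dim."""
    N = 1 + 4 + shape_dim  # confidence + box coordinates + shape encoding
    return S * S * (N * B + num_classes)


def decode_mask(shape_vector, side=16, threshold=0.5):
    """Reshape a flat shape encoding into a side x side binary mask.

    Assumes row-major ordering and a fixed threshold; both are illustrative choices.
    """
    if len(shape_vector) != side * side:
        raise ValueError("shape_vector must have side * side entries")
    return [
        [1 if shape_vector[r * side + c] >= threshold else 0 for c in range(side)]
        for r in range(side)
    ]


# Assumed example settings: 7 * 7 * (261 * 2 + 20) = 7 * 7 * 542
print(shape_output_size(7, 2, 20))  # 26558
```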
loss function:
Decodable Shape Representation
learning of a compact and decodable embedding space in which shapes can be described and compared.
Two hand-crafted representations (downsampled binary shape masks and a radial representation) and one learned shape representation (a denoising auto-encoder) are proposed.
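To make the first hand-crafted representation concrete, here is a minimal sketch of downsampling a binary mask to 16 × 16 (256 values). The block-averaging scheme, the 0.5 threshold, the requirement that the input side be a multiple of 16, and the name `downsample_mask` are assumptions chosen for this example; the notes do not say how the downsampling is actually done.

```python
def downsample_mask(mask, out_side=16, threshold=0.5):
    """Downsample a square binary mask by averaging non-overlapping blocks, then thresholding."""
    in_side = len(mask)
    if in_side % out_side != 0:
        raise ValueError("input side must be a multiple of out_side")
    block = in_side // out_side
    out = []
    for r in range(out_side):
        row = []
        for c in range(out_side):
            # Sum the foreground pixels inside the block that maps to output cell (r, c).
            total = sum(
                mask[r * block + dr][c * block + dc]
                for dr in range(block)
                for dc in range(block)
            )
            row.append(1 if total / (block * block) >= threshold else 0)
        out.append(row)
    return out
```

Flattening the 16 × 16 result row by row gives a 256-dimensional vector that can be regressed alongside the box and confidence values.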
Experiments show that the learned shape representation performs better.
Additional notes:
Radial vectors (d = 256) represent a shape as a series of distances between some centre point within the shape and points deterministically distributed over the shape’s boundary. We choose the boundary points by finding where rays cast outwards from the centre point at angles uniformly distributed over [0,2π) intersect the boundary.
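The ray-casting description above can be sketched as follows for a binary mask. This is an illustration under stated assumptions, not the paper's procedure: rays are marched in fixed steps of 0.25 pixels, positions are rounded to the nearest pixel, the centre is assumed to lie inside the shape, and the first exit along each ray is taken as the boundary. The name `radial_encoding` is hypothetical.

```python
import math


def radial_encoding(mask, centre, num_rays=256, step=0.25):
    """Distances from centre (row, col) to the shape boundary along num_rays rays."""
    height, width = len(mask), len(mask[0])
    cy, cx = centre
    distances = []
    for k in range(num_rays):
        angle = 2.0 * math.pi * k / num_rays  # angles uniform over [0, 2*pi)
        dy, dx = math.sin(angle), math.cos(angle)
        r = 0.0
        while True:
            # Look one step further along the ray and stop at the first background pixel.
            y = int(round(cy + (r + step) * dy))
            x = int(round(cx + (r + step) * dx))
            if not (0 <= y < height and 0 <= x < width) or mask[y][x] == 0:
                break
            r += step
        distances.append(r)
    return distances
```

For a disc of radius 10, every entry should come out close to 10, up to pixel rounding. Because only the first exit along each ray is recorded, a shape that a ray crosses more than once cannot be reconstructed exactly from this encoding.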
References:
[1] YOLO详解 (YOLO Explained in Detail) - Zhihu