1.EdgeDetection_1.1.DeepEdge

2015 CVPR


1. Single-Scale Architecture

(1) The KNet is an appropriate model for our setting because it has been trained over a large number of object classes (the 1000 categories of the ImageNet dataset) and thus captures features that are generic and useful for many object categories.
(2) Its architecture consists of 5 convolutional layers and 3 fully connected layers. We utilize only the first 5 convolutional layers.
(3) The second convolutional layer seems to encode coherent edge structures. The third convolutional layer fires at locations corresponding to prototypical object shapes. The fourth layer appears to generate high responses for full shapes of the object, whereas the fifth layer fires on specific object parts.



2. Extraction of High-Level Features

(1) We consider a small sub-volume of the feature map stack produced at each layer. The sub-volume is centered at the center of the patch in order to assess the presence of a contour in a small area around the candidate point.

(2) We perform max, average, and center pooling on this sub-volume. We define center pooling as selecting the center-value from each of the feature maps.

(3) Because the candidate point is located at the center of the input patch, center pooling extracts the activation value from the location that corresponds to our candidate point location.
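
The three pooling operations above can be sketched in NumPy as follows. This is only an illustrative sketch, not the authors' code: the function name `pool_subvolume`, the `(channels, height, width)` layout, and the toy feature map are assumptions made for the example.

```python
import numpy as np

def pool_subvolume(feature_maps, size):
    """Pool a centered sub-volume of a (C, H, W) feature map stack.

    `size` is the (odd) spatial side of the sub-volume; the layout and the
    function name are assumptions made for this sketch.
    """
    _, h, w = feature_maps.shape
    cy, cx = h // 2, w // 2          # center of the maps = candidate point
    r = size // 2
    sub = feature_maps[:, cy - r:cy + r + 1, cx - r:cx + r + 1]
    max_pool = sub.max(axis=(1, 2))        # one value per feature map
    avg_pool = sub.mean(axis=(1, 2))       # one value per feature map
    center_pool = feature_maps[:, cy, cx]  # center value of each map
    return np.concatenate([max_pool, avg_pool, center_pool])

# Toy stack with 2 feature maps of size 5*5 and a 3*3 sub-volume.
fmap = np.arange(2 * 5 * 5, dtype=float).reshape(2, 5, 5)
features = pool_subvolume(fmap, 3)
```

Each layer contributes three values per feature map (max, average, center), so the feature vector for one layer has three times as many entries as the layer has feature maps.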


3. Bifurcated Sub-Network

(1) We connect the feature maps computed via pooling from the five convolutional layers to two separately-trained network branches. Each branch consists of two fully-connected layers.

(2) The first branch is trained using binary labels to perform contour classification. This branch makes less selective predictions, since it only classifies whether a given point is a contour or not.

(3) The second branch is optimized as a regressor to predict the fraction of human labelers agreeing about the contour presence at a particular point. It is trained to learn the structural differences between the contours that are marked by a different fraction of human labelers.

(4) At testing time, the scalar outputs computed from these two sub-networks are averaged to produce a final score indicative of the probability that the candidate point is a contour.
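
A minimal sketch of the test-time combination described in (4), assuming each branch is a small two-layer network with a scalar output. The hidden size, the ReLU and sigmoid activations, the random weights, and the names `branch_forward` and `contour_score` are all hypothetical and chosen only to make the example runnable; the notes above do not specify them.

```python
import numpy as np

def branch_forward(x, w1, b1, w2, b2):
    """One branch: hidden layer followed by a scalar output.

    ReLU and sigmoid are assumptions of this sketch, not taken from the notes.
    """
    hidden = np.maximum(0.0, w1 @ x + b1)
    logit = w2 @ hidden + b2
    return 1.0 / (1.0 + np.exp(-logit))

def contour_score(x, cls_params, reg_params):
    """Average the classification and regression branch outputs."""
    return 0.5 * (branch_forward(x, *cls_params) + branch_forward(x, *reg_params))

rng = np.random.default_rng(0)

def random_branch(in_dim, hidden_dim):
    # Hypothetical random weights standing in for separately trained branches.
    return (0.1 * rng.normal(size=(hidden_dim, in_dim)), np.zeros(hidden_dim),
            0.1 * rng.normal(size=hidden_dim), 0.0)

x = rng.normal(size=12)            # pooled feature vector (toy size)
cls_params = random_branch(12, 8)  # stands in for the classification branch
reg_params = random_branch(12, 8)  # stands in for the regression branch
score = contour_score(x, cls_params, reg_params)
```

Both branches receive the same pooled feature vector; only their training targets differ, so averaging their outputs mixes a permissive contour detector with a more selective one.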

  

4. Other parts

(1) Loss function: Both branches optimize the sum of squared differences loss over the (binary or continuous) labels.
(2) Training data

    Binary labels: we first sample 40000 positive examples that were marked as contours by at least one of the labelers.

    Negative examples: we consider the points that were selected as candidate contour points by the Canny edge detector but that have not been marked as contours by any of the human labelers.

    Regression labels: the fraction of human labelers that marked the point as a contour.
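
The labeling scheme and the loss can be illustrated as below. This is a sketch under assumptions: the annotations are taken to be a boolean array with one mask per human labeler, the candidate mask is assumed to come from a Canny detector run elsewhere, and the names `build_labels` and `sum_squared_loss` are hypothetical.

```python
import numpy as np

def build_labels(annotations, candidates):
    """Derive training labels from human annotations.

    annotations: (num_labelers, H, W) boolean masks, one per labeler.
    candidates:  (H, W) boolean mask of candidate points (e.g. Canny output).
    """
    fraction = annotations.mean(axis=0)  # regression label per pixel
    positive = fraction > 0              # marked by at least one labeler
    negative = candidates & ~positive    # candidate, but marked by nobody
    return positive, negative, fraction

def sum_squared_loss(pred, target):
    """Sum of squared differences between predictions and labels."""
    return float(np.sum((np.asarray(pred) - np.asarray(target)) ** 2))

# Toy example: 3 labelers annotating a 2*2 image.
annotations = np.array([[[1, 0], [0, 0]],
                        [[1, 1], [0, 0]],
                        [[1, 0], [0, 0]]], dtype=bool)
candidates = np.array([[1, 0], [1, 0]], dtype=bool)
positive, negative, fraction = build_labels(annotations, candidates)
loss = sum_squared_loss([0.5, 1.0], [1.0, 0.0])
```

In this toy case the top-left pixel gets regression label 1, the top-right pixel gets 1/3, and the bottom-left pixel is a negative example because it is a candidate that no labeler marked.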


5. Multi-Scale Architecture

(1) We extract patches around the candidate point for different patch sizes so that they cover different spatial extents of the image. We then resize the patches to fit the KNet input and pump them in parallel through the five convolutional layers.

(2) The patch sizes are 64*64, 128*128, 196*196, and the full-sized image. All of the patches are then resized to the KNet input dimensions of 227*227.

(3) We use sub-volumes of convolutional feature maps having spatial sizes 7*7, 5*5, 3*3, 3*3, and 3*3 for the convolutional layers 1, 2, 3, 4, 5. Our choice of sub-volume sizes is made to ensure we are roughly considering the same spatial extent of the original image at each layer.
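
The multi-scale input preparation can be sketched as follows. The zero padding at image borders, the nearest-neighbor resizing, and the function names are assumptions made to keep the example dependency-free; the notes above do not say how borders or interpolation are handled.

```python
import numpy as np

def extract_patch(image, y, x, size):
    """Crop a size*size patch centered at (y, x).

    Zero padding beyond the image borders is an assumption of this sketch.
    """
    half = size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="constant")
    # Pixel (y, x) of the image sits at (y + half, x + half) in `padded`,
    # so the patch centered on it starts at (y, x) in padded coordinates.
    return padded[y:y + size, x:x + size]

def resize_nearest(patch, out_size):
    """Nearest-neighbor resize to out_size*out_size (stand-in for real resizing)."""
    h, w = patch.shape[:2]
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return patch[rows][:, cols]

def multiscale_inputs(image, y, x, patch_sizes=(64, 128, 196), out_size=227):
    """Build the four network inputs: three patches plus the full image."""
    patches = [extract_patch(image, y, x, s) for s in patch_sizes]
    patches.append(image)  # the full-sized image is the fourth scale
    return np.stack([resize_nearest(p, out_size) for p in patches])

image = np.zeros((100, 120, 3))
image[10, 20] = 1.0  # mark the candidate point
batch = multiscale_inputs(image, 10, 20)
```

Each of the four resized inputs would then be passed through the five convolutional layers in parallel, and the pooled features from all scales would be combined before the two branches.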
