Paper Quick Read: Photo-Sketching: Inferring Contour Drawings from Images

Photo-Sketching: Inferring Contour Drawings from Images


Year: 2019
Paper link
Github link
Project link

Summary

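  1. The task is close to edge detection: extracting contour lines (contour drawings) from real images. It can also be understood as an image translation task that converts a real image into a contour sketch. Unlike classical edge detection, this method learns from a dataset, so the network picks up some of the human attention present in the annotations. Instead of reacting only to pixel-level changes, as edge detectors do, it selectively focuses on the main subjects of the image. Figure 1 of the paper shows this comparison.
  2. A dataset of 5000 drawings was collected. The authors first gathered 1000 real outdoor images from Adobe Stock, then used the Amazon Mechanical Turk crowdsourcing platform to have workers draw the contours of each image. Each real image is paired with 5 contour drawings.
  3. A cGAN architecture is used. First, a task loss is added: the L1 loss between the sketch generated from the real image and the ground truths. Second, to handle the one-to-many nature of the dataset, $(x_i, y_i^{1}, y_i^{2}, \dots, y_i^{M_i})$ is treated as a single sample, and an MM (Mean+Min) loss is proposed: (1) average over each group's $D(x_i, y_i)$, and (2) take only $\min L(x_i, y_i)$.
  4. The resulting model generalizes to some extent; the trained model can be used directly as a "smart" edge detector.
  5. Respect for the authors. The paper feels a bit rushed to me, but it really does work better than edge detection, which I think is very meaningful and worthwhile.
  6. For the metric analysis, fine-tuning for boundary detection (including metric comparisons with boundary detection algorithms), and The Sketch Game (a small game the authors propose for data collection), see the paper.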

Abstract

In this paper, we aim to generate contour drawings, boundary-like drawings that capture the outline of the visual scene.
Prior art often cast this problem as boundary detection. However, the set of visual cues presented in the boundary detection output are different from the ones in contour drawings, and also the artistic style is ignored.
Contour drawing might be a scalable alternative to boundary annotation, which at the same time is easier and more interesting for annotators to draw.

We address these issues by collecting a new dataset of contour drawings and proposing a learning-based method that resolves diversity in the annotation and, unlike boundary detectors, can work with imperfect alignment of the annotation and the actual ground truth.

Introduction

Contour drawing contains object boundaries, salient inner edges such as occluding contours, and salient background edges.
Compared to image boundaries, contour drawings tend to have more details inside each object (including occluding contours and semantically-salient features such as eyes, mouths, etc.) and are made of strokes that are loosely aligned to pixels on the image edges.
Another element involved in contour drawing generation is adopting a proper artistic style. Fig. 1 shows that our method successfully captures the style and is itself a style transfer application.

Contribution

  • We collect a dataset containing 5000 drawings.
  • The challenge for training a contour generator is to resolve the diversity among the contours for the same image obtained from multiple annotators. We address it by proposing a novel loss that allows the network to converge to an implicit consensus, while retaining details.
  • Our contour generator can be applied to salient boundary detection. By simply fine-tuning on BSDS500, we achieve the state-of-the-art performance.
  • Finally, we show our dataset can be expanded in a cost-free way with a sketch game.

Dataset

Finally, we collect 5000 high-quality drawings on a dataset of 1000 outdoor images crawled from Adobe Stock [1] and each image is paired with exactly 5 drawings.

Method

A unique aspect of our problem here is that each training image is associated with multiple ground truth sketches drawn by different annotators. And our contour drawings on average contain 44 strokes and around 5,000 control points.
Naturally, the problem of generating contour drawing can be cast into an image translation problem or a classical boundary detection problem.
In this work, we use a different cGAN with a novel MM-loss.
cGAN Loss:
As found by previous work [38, 23], the noise vector z is usually ignored in the optimization. Therefore, we do not include z in our experiments.
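For reference, the usual conditional GAN objective without the noise vector can be written as below. This is a sketch of the standard form, not a transcription of the paper's equation; the symbol $\mathcal{L}_c$ is a name assumed here, and the paper's exact notation may differ.

```latex
% Standard cGAN objective with z dropped (assumed notation):
% D scores (image, drawing) pairs, G maps an image x to a drawing G(x).
\mathcal{L}_c(x, y) =
    \mathbb{E}_{(x, y)}\big[\log D(x, y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big],
\qquad
G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_c(x, y)
```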
We also followed the common approach in cGAN to include a task loss in addition to the GAN loss.
For our contour generation task, we set the task loss to be L1 loss which encourages sparsity required for contour outputs.
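Written out, an L1 task loss of this kind typically takes the following form. The symbol $\mathcal{L}_t$ is an assumed name, and normalization details (sum versus mean over pixels) are not specified in these notes.

```latex
% L1 task loss between the ground-truth drawing y and the generated drawing G(x)
% (assumed notation).
\mathcal{L}_t(x, y) = \big\lVert\, y - G(x) \,\big\rVert_{1}
```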
The above formulation assumes a 1-to-1 mapping between the two domains.
However, we have multiple different targets $y_i^{1}, y_i^{2}, \dots, y_i^{M_i}$ for the same input $x_i$, making it a 1-to-many mapping problem.
Our method treats $(x_i, y_i^{1}, y_i^{2}, \dots, y_i^{M_i})$ as a single training example.
To accommodate the extra targets in each training example, we propose a novel MM-loss (Min-Mean-loss). Two different aggregate functions are used for the generator G and the discriminator D respectively.
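One way to write this aggregation down, going by the description in these notes (mean over the adversarial term, min over the task term), is sketched here. Here $\mathcal{L}_c$ denotes the cGAN loss and $\mathcal{L}_t$ the L1 task loss; these symbol names and the weights $\lambda_c$, $\lambda_t$ are assumptions for illustration, so check the paper for the exact formulation.

```latex
% MM-loss for one training example (x_i, y_i^1, ..., y_i^{M_i}), assumed notation:
% "mean" over the cGAN term, "min" over the task term.
\mathcal{L}\big(x_i, y_i^{1}, \dots, y_i^{M_i}\big) =
    \lambda_c \, \frac{1}{M_i} \sum_{j=1}^{M_i} \mathcal{L}_c\big(x_i, y_i^{j}\big)
  + \lambda_t \, \min_{j \in \{1, \dots, M_i\}} \mathcal{L}_t\big(x_i, y_i^{j}\big)
```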
The “mean” aggregate function asks the discriminator to learn from all modalities in the target domain and treat those modalities with equal importance. The “min” aggregate function allows the generator to adaptively pick the most suitable modality to generate on-the-fly. The “min” aggregation function might be reminiscent of the stochastic multiple-choice loss [30] which relies on a single target output but learns multiple network output branches to generate a diverse output.
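The two aggregate functions can be illustrated with a tiny pure-Python sketch. This is not the authors' implementation: the function names, the 4-pixel "images", and the discriminator scores are all made up for illustration, and a real setup would use tensors and a trained network.

```python
import math


def l1_distance(pred, target):
    """Mean absolute difference between two flattened images of equal length."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def min_task_loss(pred, targets):
    """'min' aggregate: penalize the generator only against its closest drawing."""
    return min(l1_distance(pred, t) for t in targets)


def mean_real_loss(real_scores):
    """'mean' aggregate: average -log D(x, y_j) over all drawings of one image.

    real_scores are hypothetical discriminator outputs in (0, 1].
    """
    return sum(-math.log(s) for s in real_scores) / len(real_scores)


# Toy example: one 4-pixel prediction and three annotator drawings (made-up values).
pred = [0.0, 1.0, 0.0, 1.0]
targets = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]

task = min_task_loss(pred, targets)        # only the best-matching drawing counts
disc = mean_real_loss([0.9, 0.8, 0.7])     # every drawing counts equally
print(task, disc)
```

The point of the split is visible in the toy data: the second drawing disagrees with the prediction everywhere, yet it does not raise the generator's task loss, because another drawing matches well. An average over all targets would instead pull the output toward a blurry compromise between annotators.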

Results


In-the-Wild Testing

The qualitative results in Fig. 1 show that our model has learned general representations for salient contours in the images without content bias and incorporates random perturbations present in human drawings.
The generalization power to unseen contents suggests that our method can be applied to other tasks, for instance, salient boundary detection, which is discussed in the following section.

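Things I did not know

salient boundary detection: in image processing and computer vision, the process of identifying and localizing the salient boundaries or edges in an image. These boundaries usually mark significant changes between regions of the image, for example in color, texture, or brightness. Salient boundary detection is important for tasks such as image segmentation, object recognition, and scene understanding. Common techniques include methods based on gradient, texture, and color features, as well as deep learning methods.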

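Words I did not know

salient: prominent, notable, important. In context, the word usually describes how much a feature or piece of information stands out or draws attention within the whole.
indicative: serving as a sign or indication of something.
annotator: a person who labels or annotates data.
annotation: a note or label; usually the process of explaining, describing, or marking up data such as text, images, or audio.
occluding: blocking or obstructing. In image processing and computer vision, the word usually describes one object or region hiding another.
cusp: a peak, tip, or sharp point. In mathematics it also denotes a sharp point of a curve or surface. In other contexts, "cusp" can also mean a turning point, an edge, or a place of sudden change.
accommodate: to hold or to adapt to. Depending on context, it can mean providing enough space, time, or resources to meet the needs of someone or something.
consensus: general agreement. The word usually refers to a shared opinion or view reached within a group.
implicit: implied, unstated, or suggested. The word usually describes a concept, view, or meaning that is contained without being expressed explicitly.
out-of-the-box: ready to use immediately, with no extra setup. The phrase usually describes a product or solution that is easy to use without complex configuration or customization.
invalid: not valid or not permitted. The word usually describes a rule, convention, or state that does not meet expectations or requirements.
on-the-fly: immediately, in real time, or while in progress. The phrase is often used for something produced or completed while a task or activity is running.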
