The Limitations of Deep Learning in Adversarial Settings
paper notes:
This paper introduces the background of adversarial examples, including adversarial goals and capabilities, and then explains how to generate adversarial examples with the forward derivative: just take the derivative of the network's output with respect to the input features. That is, the Jacobian of the learned function F at input X: ∇F(X) = ∂F(X)/∂X = [∂F_j(X)/∂x_i] for every output class j and input feature i.
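A quick sketch of computing that forward derivative with autograd; the toy PyTorch model, input size, and class count here are illustrative assumptions, not the paper's setup:

```python
# Sketch: the forward derivative is the Jacobian of the network output
# w.r.t. the input. The model `net` and shapes are stand-ins.
import torch
from torch.autograd.functional import jacobian

net = torch.nn.Sequential(          # toy stand-in for the attacked DNN
    torch.nn.Linear(784, 100), torch.nn.ReLU(),
    torch.nn.Linear(100, 10), torch.nn.Softmax(dim=-1),
)

x = torch.rand(784)                 # one flattened input sample

# J[j, i] = dF_j(x) / dx_i : one row per output class,
# one column per input feature.
J = jacobian(net, x)
print(J.shape)                      # torch.Size([10, 784])
```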
The general crafting algorithm they propose iterates: pick the most salient input features according to the saliency map and perturb them, until the input is misclassified as the target or a distortion budget is exceeded (sketched below).
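A minimal sketch of that crafting loop, heavily simplified: the paper actually searches over pairs of features, while this version perturbs one feature at a time. theta and max_changed are illustrative parameters, and saliency_map is the helper sketched after the next paragraph.

```python
# Simplified paraphrase of the JSMA-style crafting loop (the paper's
# Algorithm 1 perturbs feature *pairs*; one feature here for brevity).
import torch
from torch.autograd.functional import jacobian

def craft(net, x, target, theta=1.0, max_changed=112):
    x_adv = x.clone()
    changed = 0
    while net(x_adv).argmax() != target and changed < max_changed:
        J = jacobian(net, x_adv)                    # forward derivative
        S = saliency_map(J, target)                 # per-feature scores
        i = S.argmax()                              # most salient pixel
        x_adv[i] = (x_adv[i] + theta).clamp(0, 1)   # increase intensity
        changed += 1                                # distortion budget
    return x_adv
```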
The saliency map is the spotlight of this paper. They derive it from the forward derivative; it helps explain why adversarial examples exist, and they succeed in crafting samples both by increasing and by decreasing pixel intensities.
They found that decreasing intensities is less successful because it reduces information entropy, making it harder for the DNN to extract the information it needs to classify.
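A sketch of the saliency map for the increasing-intensity case, following my reading of the paper's definition (the Jacobian J and target class t come from the forward-derivative sketch above):

```python
# Sketch: adversarial saliency map for *increasing* features.
# J has shape [num_classes, num_features]; t is the target class.
import torch

def saliency_map(J, t):
    target_grad = J[t]                   # dF_t / dx_i
    other_grad = J.sum(dim=0) - J[t]     # sum over j != t of dF_j / dx_i
    # A feature only helps if pushing it up raises the target class
    # while lowering the other classes in aggregate; otherwise score 0.
    mask = (target_grad > 0) & (other_grad < 0)
    return torch.where(mask, target_grad * other_grad.abs(),
                       torch.zeros_like(target_grad))
```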
In the evaluation, they study source-target class pairs and find that some pairs are harder than others. They define a hardness measure (quantifying the distance between two classes) and an adversarial distance (a predictive measure derived from adversarial saliency maps). Finally, they study human perception of adversarial samples.
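As a rough sketch of how I understand the adversarial distance — one minus the fraction of features with a non-zero saliency score, so a target with few exploitable features sits "further away". This is my paraphrase rather than the paper's exact formula, reusing the helpers above:

```python
# Sketch (my paraphrase): adversarial distance from the saliency map.
import torch

def adversarial_distance(J, t):
    S = saliency_map(J, t)
    useful = (S > 0).float().mean()   # fraction of exploitable features
    return 1.0 - useful               # higher = predicted to be harder
```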
Strengths:
1. reduces the distortion (L0: the number of features altered)
2. introduces the adversarial saliency map
3. moves toward mitigating adversarial examples: the hardness measure and adversarial distance
Detailed comments, possible improvements, or related ideas:
1. A defense may be possible by evaluating the regularity of examples: for instance, the squared difference between each pair of neighbouring pixels is always higher for adversarial examples than for benign ones (a sketch follows).
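A sketch of that regularity check as a total-variation-style statistic; the threshold tau is an assumption that would need calibrating on benign data:

```python
# Sketch: sum of squared differences between neighbouring pixels,
# used as a crude adversarial-input detector. tau is a placeholder.
import torch

def neighbour_roughness(img):
    """img: 2-D tensor (H, W). Returns the total squared difference
    between horizontally and vertically adjacent pixels."""
    dh = (img[:, 1:] - img[:, :-1]) ** 2   # horizontal neighbours
    dv = (img[1:, :] - img[:-1, :]) ** 2   # vertical neighbours
    return dh.sum() + dv.sum()

def looks_adversarial(img, tau):
    return neighbour_roughness(img) > tau
```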