Adversarially Robust Generalization Requires More Data

MTandHJ

Article link: https://blog.csdn.net/MTandHJ/article/details/106503395

Schmidt L, Santurkar S, Tsipras D, et al. Adversarially Robust Generalization Requires More Data[C]. neural information processing systems, 2018: 5014-5026.

@article{schmidt2018adversarially,
title={Adversarially Robust Generalization Requires More Data},
author={Schmidt, Ludwig and Santurkar, Shibani and Tsipras, Dimitris and Talwar, Kunal and Madry, Aleksander},
pages={5014–5026},
year={2018}}

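Overview

This paper analyzes adversarial robustness on two binary classification models, a Gaussian model and a Bernoulli model, and shows that learning an adversarially robust classifier requires more data than learning a standard one.

Main content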

Gaussian model: Let $\theta^* \in \mathbb{R}^d$ be the mean vector and $\sigma > 0$. The $(\theta^*, \sigma)$-Gaussian model is defined as follows: first draw a label $y \in \{\pm 1\}$ uniformly at random, then draw $x \in \mathbb{R}^d$ from $\mathcal{N}(y \cdot \theta^*, \sigma^2 I)$.
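The sampling procedure can be sketched in a few lines of Python. This is only an illustration of the definition: the function name `sample_gaussian_model` and the `seed` parameter are hypothetical and do not come from the paper.

```python
import random

def sample_gaussian_model(theta_star, sigma, n, seed=0):
    """Draw n labelled points (x, y) from the (theta_star, sigma)-Gaussian model."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Label is uniform on {-1, +1}.
        y = rng.choice([-1, 1])
        # x ~ N(y * theta_star, sigma^2 I): independent coordinates, common std sigma.
        x = [rng.gauss(y * t, sigma) for t in theta_star]
        samples.append((x, y))
    return samples

if __name__ == "__main__":
    for x, y in sample_gaussian_model([1.0, 1.0, 1.0], sigma=0.5, n=3):
        print(y, [round(v, 3) for v in x])
```

The two classes are mirror images of each other around the origin, so the only thing a learner has to estimate is the direction of $\theta^*$.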

Bernoulli model: Let $\theta^* \in \{\pm1\}^d$ be the mean vector and $\tau > 0$. The $(\theta^*, \tau)$-Bernoulli model is defined as follows: first draw a label $y \in \{\pm 1\}$ uniformly at random, then draw $x \in \{\pm 1\}^d$ coordinate-wise from the following distribution:
$x_i = \left \{ \begin{array}{rl} y \cdot \theta_i^* & \mathrm{with} \: \mathrm{probability} \: 1/2+\tau \\ -y \cdot \theta_i^* & \mathrm{with} \: \mathrm{probability} \: 1/2-\tau \end{array} \right.$
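A minimal sketch of this sampler, assuming the coordinates are drawn independently (as the coordinate-wise definition suggests). The name `sample_bernoulli_model` and the `seed` parameter are hypothetical.

```python
import random

def sample_bernoulli_model(theta_star, tau, n, seed=0):
    """Draw n labelled points (x, y) from the (theta_star, tau)-Bernoulli model."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Label is uniform on {-1, +1}.
        y = rng.choice([-1, 1])
        # Each coordinate agrees with y * theta_i with probability 1/2 + tau,
        # and is flipped with probability 1/2 - tau.
        x = [y * t if rng.random() < 0.5 + tau else -y * t for t in theta_star]
        samples.append((x, y))
    return samples

if __name__ == "__main__":
    for x, y in sample_bernoulli_model([1, -1, 1, 1], tau=0.2, n=3):
        print(y, x)
```

A small $\tau$ means each individual coordinate is only weakly correlated with the label, which is what makes this model harder than the Gaussian one.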

Classification error: Let $\mathcal{P}: \mathbb{R}^d \times \{\pm 1\} \rightarrow \mathbb{R}$ be a distribution. The classification error $\beta$ of a classifier $f:\mathbb{R}^d \rightarrow \{\pm1\}$ is defined as $\beta=\mathbb{P}_{(x, y) \sim \mathcal{P}} [f(x) \neq y]$.

Robust classification error: Let $\mathcal{P}: \mathbb{R}^d \times \{\pm 1\} \rightarrow \mathbb{R}$ be a distribution and let $\mathcal{B}: \mathbb{R}^d \rightarrow \mathscr{P}(\mathbb{R}^d)$ be a perturbation set. The $\mathcal{B}$-robust classification error $\beta$ of a classifier $f:\mathbb{R}^d \rightarrow \{\pm1\}$ is defined as $\beta=\mathbb{P}_{(x, y) \sim \mathcal{P}} [\exists x' \in \mathcal{B}(x): f(x') \neq y]$.

Note: $\mathcal{B}_p^{\epsilon}(x)$ denotes $\{x' \in \mathbb{R}^d \mid \|x'-x\|_p \le \epsilon\}$.
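For a linear classifier the $\ell_\infty$ robust error can be computed exactly, without searching over perturbations. The sketch below assumes $f_w(x) = \mathrm{sign}(\langle w, x\rangle)$, which the notes use without writing out. Under that assumption, the worst perturbation in $\mathcal{B}_\infty^{\epsilon}(x)$ lowers the margin $y\langle w, x\rangle$ by exactly $\epsilon\|w\|_1$, so a point counts as a robust error when $y\langle w, x\rangle - \epsilon\|w\|_1 \le 0$. The function name and the convention of counting a zero margin as an error are choices made for this illustration.

```python
def robust_error_linear(w, samples, eps):
    """Empirical l_inf robust error of f_w(x) = sign(<w, x>) with budget eps.

    samples is a list of (x, y) pairs with y in {-1, +1}.
    """
    l1_norm = sum(abs(wi) for wi in w)
    errors = 0
    for x, y in samples:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # The adversary can reduce the margin by at most eps * ||w||_1.
        if margin - eps * l1_norm <= 0:
            errors += 1
    return errors / len(samples)

if __name__ == "__main__":
    data = [([1.0, 1.0], 1), ([0.1, 0.1], 1), ([-1.0, -1.0], -1)]
    print(robust_error_linear([1.0, 1.0], data, eps=0.0))
    print(robust_error_linear([1.0, 1.0], data, eps=0.5))
```

Setting `eps=0` recovers the ordinary classification error, so the same function covers both definitions.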

Gaussian model

upper bound

Theorem 18: Let $(x_1,y_1),\ldots, (x_n,y_n) \in \mathbb{R}^d \times \{\pm 1\}$ be drawn i.i.d. from a $(\theta^*, \sigma)$-Gaussian model with $\|\theta^*\|_2=\sqrt{d}$. Let $\hat{w}:=\bar{z}/\|\bar{z}\|_2 \in \mathbb{R}^d$, where $\bar{z}=\frac{1}{n} \sum_{i=1}^n y_ix_i$. Then with probability at least $1-2\exp(-\frac{d}{8(\sigma^2+1)})$, the linear classifier $f_{\hat{w}}$ has classification error at most
$\exp (-\frac{(2\sqrt{n}-1)^2d}{2(2\sqrt{n}+4\sigma)^2\sigma^2}).$
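To get a feel for the numbers, the bound and its success probability can be evaluated directly. This is just the formula above transcribed into Python; the function names are hypothetical.

```python
import math

def gaussian_error_bound(n, d, sigma):
    """Upper bound on the classification error from Theorem 18."""
    numerator = (2 * math.sqrt(n) - 1) ** 2 * d
    denominator = 2 * (2 * math.sqrt(n) + 4 * sigma) ** 2 * sigma ** 2
    return math.exp(-numerator / denominator)

def gaussian_success_probability(d, sigma):
    """Probability with which the bound of Theorem 18 holds (lower bound)."""
    return 1 - 2 * math.exp(-d / (8 * (sigma ** 2 + 1)))

if __name__ == "__main__":
    for n in (1, 4, 100):
        print(n, gaussian_error_bound(n, d=100, sigma=1.0))
    print(gaussian_success_probability(d=100, sigma=1.0))
```

With $n = 1$, $d = 100$ and $\sigma = 1$ the exponent is $-100/72$, so even a single sample already gives a nontrivial guarantee, and the bound shrinks further as $n$ grows.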

Theorem 21: Let $(x_1,y_1),\ldots, (x_n,y_n) \in \mathbb{R}^d \times \{\pm 1\}$ be drawn i.i.d. from a $(\theta^*, \sigma)$-Gaussian model with $\|\theta^*\|_2=\sqrt{d}$. Let $\hat{w}:=\bar{z}/\|\bar{z}\|_2 \in \mathbb{R}^d$, where $\bar{z}=\frac{1}{n} \sum_{i=1}^n y_ix_i$. If
$\epsilon \le \frac{2\sqrt{n}-1}{2\sqrt{n}+4\sigma} - \frac{\sigma\sqrt{2\log 1/\beta}}{\sqrt{d}},$

then with probability at least $1-2\exp(-\frac{d}{8(\sigma^2+1)})$, the linear classifier $f_{\hat{w}}$ has $\ell_{\infty}^{\epsilon}$-robust classification error at most $\beta$.
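The condition on $\epsilon$ can be read as "the largest perturbation budget the theorem certifies for a given sample size". A small sketch, assuming $\log$ denotes the natural logarithm; the function name is hypothetical.

```python
import math

def max_certified_eps(n, d, sigma, beta):
    """Largest eps for which Theorem 21 guarantees robust error at most beta."""
    sample_term = (2 * math.sqrt(n) - 1) / (2 * math.sqrt(n) + 4 * sigma)
    noise_term = sigma * math.sqrt(2 * math.log(1 / beta)) / math.sqrt(d)
    return sample_term - noise_term

if __name__ == "__main__":
    for n in (1, 10, 100, 10000):
        print(n, max_certified_eps(n, d=10000, sigma=1.0, beta=0.01))
```

The first term grows towards 1 as $n$ increases, while the second term does not depend on $n$ at all. A larger certified $\epsilon$ therefore has to be paid for with more samples, which is the theme of the paper.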

lower bound

Theorem 11: Let $g_n$ be any learning algorithm, let $\sigma > 0$ and $\epsilon \ge 0$, and let $\theta \in \mathbb{R}^d$ be drawn from $\mathcal{N}(0,I)$. Draw $n$ samples from the $(\theta,\sigma)$-Gaussian model and let $f_n: \mathbb{R}^d \rightarrow \{\pm 1\}$ be the classifier that $g_n$ produces from them. Then the expected $\ell_{\infty}^{\epsilon}$-robust classification error of $f_n$, taken over $\theta, (y_1,\ldots, y_n), (x_1,\ldots, x_n)$, is at least
$\frac{1}{2} \mathbb{P}_{v\sim \mathcal{N}(0, I)} [\sqrt{\frac{n}{\sigma^2+n}} \|v\|_{\infty} \le \epsilon ].$
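This lower bound has a closed form that is easy to evaluate. Since the coordinates of $v \sim \mathcal{N}(0, I)$ are independent, $\mathbb{P}[\|v\|_\infty \le t] = \mathrm{erf}(t/\sqrt{2})^d$. That identity is a standard fact about the normal distribution rather than something stated in the notes, and the function name below is hypothetical.

```python
import math

def gaussian_robust_lower_bound(n, d, sigma, eps):
    """Evaluate 1/2 * P[ sqrt(n / (sigma^2 + n)) * ||v||_inf <= eps ] for v ~ N(0, I_d)."""
    if n == 0:
        # With no samples the scale factor is 0, so the event always holds.
        return 0.5
    scale = math.sqrt(n / (sigma ** 2 + n))
    threshold = eps / scale
    # P[|v_i| <= t] = erf(t / sqrt(2)) for a standard normal coordinate.
    per_coordinate = math.erf(threshold / math.sqrt(2))
    return 0.5 * per_coordinate ** d

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, gaussian_robust_lower_bound(n, d=100, sigma=10.0, eps=0.5))
```

For small $n$ relative to $\sigma^2$ the bound stays close to $1/2$, meaning no algorithm does much better than random guessing in the robust sense.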

Bernoulli model

upper bound

Let $(x, y) \in \mathbb{R}^d \times \{\pm1\}$ be drawn from a $(\theta^*, \tau)$-Bernoulli model. Let $\hat{w}=z / \|z\|_2$, where $z = y x$. Then with probability at least $1 - \exp (-\frac{\tau^2d}{2})$, the linear classifier $f_{\hat{w}}$ has classification error at most $\exp (-2\tau^4d)$.
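The single-sample classifier is simple enough to simulate. The sketch below draws one training point, builds $\hat{w} = yx/\|yx\|_2$, and measures its error on fresh test points. It assumes $f_w(x) = \mathrm{sign}(\langle w, x\rangle)$ with ties broken towards $+1$; all function names and the default parameter values are hypothetical choices for illustration.

```python
import math
import random

def draw(theta_star, tau, rng):
    """One sample (x, y) from the (theta_star, tau)-Bernoulli model."""
    y = rng.choice([-1, 1])
    x = [y * t if rng.random() < 0.5 + tau else -y * t for t in theta_star]
    return x, y

def one_sample_classifier(x, y):
    """w_hat = z / ||z||_2 with z = y * x; ||z||_2 = sqrt(d) since entries are +-1."""
    norm = math.sqrt(len(x))
    return [y * xi / norm for xi in x]

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def empirical_error(d=1000, tau=0.25, n_test=200, seed=0):
    rng = random.Random(seed)
    theta_star = [rng.choice([-1, 1]) for _ in range(d)]
    x_train, y_train = draw(theta_star, tau, rng)
    w_hat = one_sample_classifier(x_train, y_train)
    mistakes = 0
    for _ in range(n_test):
        x, y = draw(theta_star, tau, rng)
        if predict(w_hat, x) != y:
            mistakes += 1
    return mistakes / n_test

if __name__ == "__main__":
    print("empirical error:", empirical_error())
    print("bound exp(-2 tau^4 d):", math.exp(-2 * 0.25 ** 4 * 1000))
```

Normalizing by $\|z\|_2$ does not change any prediction; it only fixes the scale of $\hat{w}$.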

lower bound

Lemma 30: Let $\theta^* \in \{\pm1\}^d$ and consider the linear classifier $f_{\theta^*}$ for the $(\theta^*, \tau)$-Bernoulli model. Then:
$\ell_{\infty}^{\tau}$-robustness: the $\ell_{\infty}^{\tau}$-robust classification error of $f_{\theta^*}$ is at most $2\exp (-\tau^2d/2)$.
$\ell_{\infty}^{3\tau}$-nonrobustness: the $\ell_{\infty}^{3\tau}$-robust classification error of $f_{\theta^*}$ is at least $1-2\exp (-\tau^2d/2)$.
Near-optimality of $\theta^*$: for any linear classifier, the $\ell_{\infty}^{3\tau}$-robust classification error is at least $\frac{1}{6}$.

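Theorem 31: Let $g_n$ be any learning algorithm for linear classifiers. Suppose $\theta^*$ is drawn uniformly at random from $\{\pm1\}^d$, that $n$ samples are drawn from the $(\theta^*, \tau)$-Bernoulli distribution with $\tau \le 1/4$, and that $g_n$ produces the linear classifier $f_{w}$ from them. Let $\epsilon < 3\tau$ and $\gamma < 1/2$. Then if
$n \le \frac{\epsilon^2\gamma^2}{5000 \cdot \tau^4 \log (4d/\gamma)},$
the expected $\ell_{\infty}^{\epsilon}$-robust classification error of $f_w$, taken over $\theta^*, (y_1,\ldots, y_n), (x_1,\ldots, x_n)$, is at least $\frac{1}{2}-\gamma$.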
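The sample-size threshold in this theorem can be evaluated directly. The sketch below assumes $\log$ is the natural logarithm and additionally requires $\gamma > 0$ so that the logarithm is defined; the function name is hypothetical.

```python
import math

def bernoulli_sample_threshold(d, tau, eps, gamma):
    """Right-hand side of the condition on n in Theorem 31.

    If n is at most this value, the expected robust error is at least 1/2 - gamma.
    """
    if not (0 < tau <= 0.25):
        raise ValueError("tau must lie in (0, 1/4]")
    if not (0 <= eps < 3 * tau):
        raise ValueError("eps must satisfy 0 <= eps < 3 * tau")
    if not (0 < gamma < 0.5):
        raise ValueError("gamma must lie in (0, 1/2)")
    return eps ** 2 * gamma ** 2 / (5000 * tau ** 4 * math.log(4 * d / gamma))

if __name__ == "__main__":
    for tau in (0.1, 0.01, 0.001):
        print(tau, bernoulli_sample_threshold(d=1000, tau=tau, eps=2 * tau, gamma=0.25))
```

When $\epsilon$ is a fixed multiple of $\tau$, the threshold scales like $1/\tau^2$ up to the logarithmic factor, so a weaker per-coordinate signal raises the number of samples below which robust learning fails.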
