Paper notes:
This paper focuses on the black-box, untargeted threat model.
the attack problem:
Prerequisites: a loss function L(x, y) and a distance metric (an lp norm).
The problem is non-convex, but the earlier projected gradient descent (PGD) attack solves it well.
PGD:
Iterative: given an input x and its correct label y, run k steps starting from x0 = x.
description in paper: "Π_S is the projection onto the set S, B_p(x′, ε′) is the l_p ball of radius ε′ around x′, η is the step size, and ∂U is the boundary of a set U. Also, as is standard in continuous optimization, we make s_l be the projection of the gradient ∇_x L(x_{l−1}, y) at x_{l−1} onto the unit l_p ball. This way we ensure that s_l corresponds to the unit l_p-norm vector that has the largest inner product with ∇_x L(x_{l−1}, y). (Note that, in the case of the l_2-norm, s_l is simply the normalized gradient but in the case of, e.g., the l_∞-norm, s_l corresponds to the sign vector, sgn(∇_x L(x_{l−1}, y)) of the gradient.)"
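The PGD description quoted above can be sketched roughly as follows. This is a minimal illustration for the l2 and l∞ cases only, not the paper's code. The names `grad_fn` (a callable returning the loss gradient at a point, with the label y held fixed), `eps`, `eta`, and `k` are assumptions made for this sketch.

```python
import numpy as np

def steepest_direction(grad, p):
    """Unit l_p-norm vector with the largest inner product with grad."""
    if p == 2:
        norm = np.linalg.norm(grad)
        return grad / norm if norm > 0 else np.zeros_like(grad)
    if p == np.inf:
        return np.sign(grad)
    raise ValueError("only p=2 and p=inf are sketched here")

def project_ball(x_adv, x, eps, p):
    """Project x_adv onto the l_p ball of radius eps around x."""
    delta = x_adv - x
    if p == 2:
        norm = np.linalg.norm(delta)
        if norm > eps:
            delta = delta * (eps / norm)
    elif p == np.inf:
        delta = np.clip(delta, -eps, eps)
    else:
        raise ValueError("only p=2 and p=inf are sketched here")
    return x + delta

def pgd(x, grad_fn, eps, eta, k, p):
    """Run k ascent steps on the loss, starting from x0 = x."""
    x_adv = np.array(x, dtype=float)
    for _ in range(k):
        s = steepest_direction(grad_fn(x_adv), p)
        # Step along s, then project back onto the allowed ball around x.
        x_adv = project_ball(x_adv + eta * s, x, eps, p)
    return x_adv
```

In the white-box setting `grad_fn` would be the true gradient; in the black-box setting it has to be replaced by an estimate built from loss queries.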
In the black-box setting, the directional derivative can still be estimated from loss queries,
and from these estimates one can construct an estimate of the gradient,
then run PGD on this estimated gradient.
ZOO first used this method, but its query complexity is high, proportional to the input dimension.
Later work showed that imperfect gradient estimators are also enough to generate adversarial examples.
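The query-based gradient estimate described above can be illustrated with a coordinate-wise finite-difference sketch. It is a generic illustration rather than ZOO's actual implementation. `loss_fn` (a black-box callable returning the scalar loss for an input) and the step `delta` are assumptions. The loop makes the cost visible: one query per coordinate plus one for the base point.

```python
import numpy as np

def estimate_gradient(loss_fn, x, delta=1e-3):
    """Forward-difference gradient estimate using x.size + 1 loss queries."""
    base = loss_fn(x)  # one query at the unperturbed point
    flat = np.zeros(x.size)
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = 1.0  # i-th standard basis direction
        # Directional derivative along e, estimated with one extra query.
        flat[i] = (loss_fn(x + delta * e.reshape(x.shape)) - base) / delta
    return flat.reshape(x.shape)
```

For an image with tens of thousands of pixels this means tens of thousands of queries per PGD step, which is the dimension-proportional cost noted above.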
(Owing to a draft-saving problem, much of the following part is omitted.)
time-dependent prior:
data-dependent prior:
"whenever two coordinates (i, j) and (k, l) of ∇_x L(x, y) are close, we expect ∇_x L(x, y)_ij ≈ ∇_x L(x, y)_kl too"
using bandit optimization approach
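Since the notes on this part are mostly missing, the following is only a rough sketch of how the two priors and the bandit view might fit together. It is a simplified illustration under stated assumptions, not the paper's algorithm. It assumes that the data-dependent prior is exploited by sharing one value per square tile of pixels, and that a latent vector `v` (carried from one attack step to the next, which is one way to read the time-dependent prior) is refined with two loss queries per round. The names `loss_fn`, `tile`, `delta`, `eps`, and `eta` are hypothetical, and `v` is neither projected nor normalised here.

```python
import numpy as np

def upsample_tiles(low_res, tile):
    """Repeat each entry of low_res over a tile x tile block of pixels."""
    return np.kron(low_res, np.ones((tile, tile)))

def bandit_prior_step(loss_fn, x, v, tile, rng, delta=0.1, eps=0.01, eta=0.1):
    """Refine the latent gradient estimate v using two loss queries."""
    u = rng.standard_normal(v.shape)  # random exploration direction
    q_plus = upsample_tiles(v + delta * u, tile)
    q_minus = upsample_tiles(v - delta * u, tile)
    # Antithetic finite difference: approximates the inner product of the
    # true gradient with the upsampled direction u.
    slope = (loss_fn(x + eps * q_plus) - loss_fn(x + eps * q_minus)) / (2 * eps * delta)
    # Move v toward directions that increase the loss.
    return v + eta * slope * u
```

The upsampled `v` would then play the role of the gradient in a PGD step, with the same `v` passed into the next round instead of being re-estimated from scratch.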
Strengths:
1. Proves the optimality of a standard least-squares estimator.
2. Considers prior knowledge: a time-dependent prior and a data-dependent prior.
3. Uses a bandit optimization approach to outperform previous black-box attacks.
Detailed comments, possible improvements, or related ideas:
1. Is any other prior knowledge useful for the gradient estimator?
2. Are there better methods for integrating priors?
3. How can one defend against optimization-based black-box attacks?